Rewrite shift-AND algorithm to enable compiler auto vectorization when GCC 14.1 becomes available. #171
Comments
GCC 14 is available on the manylinux_2_34 images. Hurray! https://github.com/pypa/manylinux?tab=readme-ov-file#manylinux_2_34-almalinux-9-based
Bioconda still uses GCC 13.3, as per the latest cutadapt build.
Whoops, I made a mistake. Auto-vectorization is not yet possible when the implementation is correct. Oh well, I still need to refactor it.
It works and the code is faster, but not nearly as fast as hand-vectorized code. Since the algorithm is not too hard, it is much better to hand-vectorize this code. It can be easily ported to ARM instructions when the need arises.
See https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1AB9U8lJL6yAngGVG6AMKpaAVxYMQkgGykHAGTwGTAA5dwAjTGIQACZfAAdUBUJbBmc3Dy8EpJSBQOCwlkjouItMKxsBIQImYgJ0908fMorU6tqCfNCIqNjfBRq6hszmgY6uwuK%2BgEoLVFdiZHYOAFIYgGYg5DcsAGoV9ccB/EEAOgQD7BWNAEE1zYZt1z2Do4J0LCpzy%2Bu724A3VB4dD7W40BjoYwsJgEZAITAKCBoBgDXauIIEAAcxgIuwAVMQEQRiHhrLsBiSjKRyXgAF6YHHk4lBYDGehGAgIUi/Xa8mn0xkMHrEYyoKjGeIwghRFHc258tEY7ySRkEokksn8VwQqFMBQAawU1J5fPRgmVqsJFNJuKChF1BqNJt5ZoIFtxeLV1rJ4UI0MdcpuCuSAo9eOhsIQxgGJEw01BdwA7AAhZ27fjEXYQEMM20HAAiGgOyd2eFeQqKUVF4slBGlxBRxdLa1TMWT8ZWKbTCtd7vT8x1/v1Gn263z/e1kKHChWAFZk2XZ/mm93TUqVbitYO9fquKPx1upzuZ/O8Ps27suHPl%2BsS6uXevGYeHfqYvuJ9vHXOF%2BeSzFryu8oKmu5obh%2BR4Gus77PtO35ni2uzrABt4JveiqgYydoEC%2BI4FqWDD2rBp7IamQHAehbpgVhL57nh1FET%2BCFXkuxZob2VEEdhQ5vnRnEviejEXv%2BLG3mxj62nxQ5QbxhHHnBv6ISRvxiRhHq%2BgQQ64WOuzqQxi43qRQbkRRfZ4rpO60dp5lfqeCnMQZKmUaq1mvu%2BLkCfBQlKWRwHsc5fo7tJVkBTZgklkhImGY5fYAEpaeORaiT5PbibsMWWQlrHJSBTm4jFPHaYlhnGSZYExUFmVJX8RnkX5uJiISTDoAAnsYh7vkVaEZlmOaMtoBZFbs2ijo4TKUqy7LAJyTb9W2LYdl22XkXFI2vHhV5VSVCrpathzrVlNVbflu2OPtm1bby5UnWdUVLcBK2dqd2n0TunV3dttGJk944vQaG3FRdaU8V9MlcTu/7nRdV2PaDL5IZDW2utiElYKo77WkY379ZFaHbbhcR4S5RanhCmCqN5h0lTtazeITIXDnBpNo0xFOA0D5608FGkWYzqOs4D0ME1zQ7MQuTN2RTynvXyeBUFmD1xOBOH7CDX1pbRiswRZC0A2zuwRnC0ZEIScEAfO2PLnNY7/bjfKdsu0u8rL8s8ZrA4Qa5MNq4L3hK1JOu28BBtRjGJvESx5sAVb%2BY247KsO5TduJgnCr28pyccLMtCcLOvCeNwvCoJwo0KPMiyYOe6w8KQBCaJnszwk1vQQLM%2BogLOGj6Jwkh53XpBFxwvAKCAne1xwWizHAsBIGgLDxHQUTkJQs/z/Q0TIMAUhcHwdD1sPEDhH3vrMMQzWcNXx%2B1M1ADy4TaJg1jn7ws9sII18MLQZ/j7wWDhK4wCODELQYeBdSBYGhEYcQ38wF4EJNYPA/wER9zJg/Vw0on7kEEOUPutA8DhGIFfZwWA%2B7MhYBgxBxBwhJEwPmTAEDgC4KMHXWYVADDAAUAANTwJgAA7tfeIjAMH8EECIMQ7ApAyEEIoFQ6hoG6G3gYJhphjDmFweEYekBZioHiJUFEnAAC018oL6OOAWZAABOcxhcKEkiwBoluLQH66PsBCYYnht4BCCN0Ss0Rt6JGSLotxeh/G5AYBMYUehLBOLaIMeoLhGiRPKNEqosTwk%2BMibEoJ28xh1DSVMLgsxS4LCWBILOOde7QIHrsVQmJvD6OVLsYAyBkCXkkKcPcEBcCEBIJXApvAx4TwbpgJu0QHFtw7l3DgPdSD5y0P3TgQ8R412YWUjgMQKlzIH
v0lZpAKHJDsJIIAA%3D
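For readers unfamiliar with the algorithm under discussion, here is a minimal scalar sketch of the textbook shift-AND recurrence in C. It is not this project's implementation: the function name `shift_and_search`, its signature, and the 64-character pattern limit are assumptions made for illustration only. Note that each state update depends on the previous one, so this inner loop is not itself a vectorization candidate.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from this project): returns the start index of
 * the first exact occurrence of pattern in text, or -1 if there is none.
 * Patterns must be 1..64 characters so the state fits in one uint64_t. */
static ptrdiff_t shift_and_search(const char *text, size_t text_len,
                                  const char *pattern, size_t pattern_len)
{
    if (pattern_len == 0 || pattern_len > 64) {
        return -1;
    }

    /* masks[c] has bit i set when pattern[i] == c. */
    uint64_t masks[256] = {0};
    for (size_t i = 0; i < pattern_len; i++) {
        masks[(uint8_t)pattern[i]] |= (uint64_t)1 << i;
    }

    /* Bit (pattern_len - 1) set in the state means a full match ends here. */
    const uint64_t accept = (uint64_t)1 << (pattern_len - 1);
    uint64_t state = 0;

    for (size_t j = 0; j < text_len; j++) {
        /* Extend every partial match by one character and start a new one. */
        state = ((state << 1) | 1) & masks[(uint8_t)text[j]];
        if (state & accept) {
            return (ptrdiff_t)(j + 1 - pattern_len);
        }
    }
    return -1;
}
```

Calling `shift_and_search("GATTACA", 7, "TAC", 3)` walks the text once and reports the position where the pattern starts.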