Further optimize gemm #13

MJChku · 2021-02-07T12:47:25Z

Thanks for your great work!

I plan to work on BNN optimization as well, for various applications (generative models, classifiers) on a powerful CPU.
I spent a few hours on preliminary work changing the "micro_kernel" to use AVX-512, and it showed a 4x speedup from a simple one-loop optimization (note that -O3 does not vectorize this loop on its own). Do you plan to work on this further and boost the performance more?
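
For illustration, here is a minimal sketch of the kind of change described above, assuming the inner loop is an XNOR-plus-popcount dot product over bit-packed 64-bit words. The function name `xnor_popcount_dot` and its signature are hypothetical, not the repository's actual "micro_kernel". The AVX-512 path is compiled only when the compiler reports AVX512F and AVX512VPOPCNTDQ support; a scalar loop handles the tail and serves as the fallback.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__AVX512F__) && defined(__AVX512VPOPCNTDQ__)
#include <immintrin.h>
#endif

/* Hypothetical binary dot product: counts the bit positions where a and b
 * agree (XNOR followed by popcount) across n 64-bit words.
 * This is an illustrative sketch, not the repository's micro_kernel. */
static uint64_t xnor_popcount_dot(const uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t total = 0;
    size_t i = 0;

#if defined(__AVX512F__) && defined(__AVX512VPOPCNTDQ__)
    /* Vector path: process 8 x 64-bit words (512 bits) per iteration. */
    __m512i acc = _mm512_setzero_si512();
    const __m512i ones = _mm512_set1_epi64(-1);
    for (; i + 8 <= n; i += 8) {
        __m512i va = _mm512_loadu_si512((const void *)(a + i));
        __m512i vb = _mm512_loadu_si512((const void *)(b + i));
        /* XNOR = NOT(a XOR b), done here as (a XOR b) XOR all-ones. */
        __m512i x = _mm512_xor_si512(_mm512_xor_si512(va, vb), ones);
        /* Per-lane popcount, accumulated in eight 64-bit lanes. */
        acc = _mm512_add_epi64(acc, _mm512_popcnt_epi64(x));
    }
    total += (uint64_t)_mm512_reduce_add_epi64(acc);
#endif

    /* Scalar tail (and full fallback when AVX-512 is unavailable).
     * __builtin_popcountll is a GCC/Clang builtin. */
    for (; i < n; ++i)
        total += (uint64_t)__builtin_popcountll(~(a[i] ^ b[i]));

    return total;
}
```

Building with something like `-O3 -mavx512f -mavx512vpopcntdq` on GCC or Clang should enable the vector path; without those flags the same function runs the scalar loop.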
