Description
Feature request
Support a quantization block size of 32.
Motivation
Many recent models can give better results when quantized with a block size of 32, and many benchmarks are run with a block size of 32.
Several types of models, such as vision models and image generators, are also more sensitive to block size, so a block size of 32 (or even 16) can be better suited to those tasks.
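To illustrate why block size can matter, here is a minimal sketch of block-wise absmax quantization on synthetic data. It is not this library's implementation: the function names, the 4-bit symmetric scheme, and the Gaussian weights with one injected outlier are all assumptions made for the example. Each block shares one scale, so a single outlier degrades every value in its block, and smaller blocks limit how many values are affected.

```python
import random


def quantize_blockwise(values, block_size, bits=4):
    """Round-trip values through symmetric absmax quantization.

    Hypothetical helper for illustration: one scale per block of
    `block_size` consecutive values, signed integer grid of `bits` bits.
    """
    qmax = 2 ** (bits - 1) - 1  # largest positive level, 7 for 4 bits
    restored = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        absmax = max(abs(v) for v in block)
        # Avoid dividing by zero when a block is all zeros.
        scale = absmax / qmax if absmax > 0 else 1.0
        # Quantize to the nearest level, then dequantize with the same scale.
        restored.extend(round(v / scale) * scale for v in block)
    return restored


def mean_squared_error(original, restored):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(original, restored)) / len(original)


# Synthetic "weights" (assumption): standard normal values plus one outlier.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]
weights[100] = 25.0

# Reconstruction error for each candidate block size.
errors = {}
for block_size in (32, 64, 128):
    restored = quantize_blockwise(weights, block_size)
    errors[block_size] = mean_squared_error(weights, restored)
    print(f"block_size={block_size}: mse={errors[block_size]:.5f}")
```

On this toy input the outlier inflates the scale of only 32 values instead of 128, which is the kind of effect the request is about. Real models and real quantization formats will behave differently, so this only shows the mechanism and is not a benchmark.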
Your contribution
If someone can point out where I should look, I can also open a PR. However, I am not sure about compiling with a different version.