Add support for quantization in Neuron. Right now it is hard-coded to Float32. There is a branch up that tries to generalize this using generics, but it involves a lot of changes. Branch: quantization. Look into how mlx-swift handles this.
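For reference, here is a minimal sketch of the kind of Float32-to-int8 affine quantization the guides linked in this issue describe. It is written in plain Python only to illustrate the arithmetic. The function names are hypothetical and this is not how Neuron or mlx-swift implement it.

```python
# Illustrative only: quantize_int8 / dequantize_int8 are hypothetical names,
# not part of Neuron or mlx-swift.

def quantize_int8(values):
    """Affine (asymmetric) quantization of a list of floats to signed 8-bit ints."""
    qmin, qmax = -128, 127
    # Extend the range to include 0.0 so that zero maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 when every value is zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    # Round to the nearest step, shift by the zero point, clamp to int8 range.
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize_int8(q, scale, zero_point):
    """Map int8 codes back to approximate float values."""
    return [(x - zero_point) * scale for x in q]
```

Storing the scale and zero point alongside the int8 codes is what lets a layer recover approximate Float32 values on demand, which is one reason the storage type and the compute type may need to be decoupled.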
Links:
https://huggingface.co/docs/optimum/en/concept_guides/quantization
https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization