v0.1.5
What's Changed
- Only apply attention mask if seqlen is greater than 1 by @casper-hansen in #96
- add gpt_neox support by @twaka in #113
- [core] Support fp32 / bf16 inference by @younesbelkada in #121
- Fix potential overflow by @casper-hansen in #102
- Fixing starcoder based models with 15B by @SebastianBodza in #118
- Support Aquila models. by @ftgreat in #123
- Add benchmark of Aquila2 34B AWQ in README.md. by @ftgreat in #126
New Contributors
- @twaka made their first contribution in #113
- @younesbelkada made their first contribution in #121
- @SebastianBodza made their first contribution in #118
- @ftgreat made their first contribution in #123
Full Changelog: v0.1.4...v0.1.5