v0.1.5

github-actions released this 28 Oct 16:41

What's Changed

Only apply attention mask if seqlen is greater than 1 by @casper-hansen in #96
Add gpt_neox support by @twaka in #113
[core] Support fp32 / bf16 inference by @younesbelkada in #121
Fix potential overflow by @casper-hansen in #102
Fix StarCoder-based 15B models by @SebastianBodza in #118
Support Aquila models by @ftgreat in #123
Add benchmark of Aquila2 34B AWQ to README.md by @ftgreat in #126

New Contributors

@twaka made their first contribution in #113
@younesbelkada made their first contribution in #121
@SebastianBodza made their first contribution in #118
@ftgreat made their first contribution in #123

Full Changelog: v0.1.4...v0.1.5

Contributors

twaka, ftgreat, and 3 other contributors