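Without relying on dynamic NTK extrapolation, the model can maintain its performance within 32K-64K tokens. Without lmdeploy, the usable context length mainly depends on GPU memory.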
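To give a rough sense of why GPU memory becomes the limit, here is a minimal sketch that estimates the KV-cache size for a single sequence. The default values (32 layers, 8 key/value heads, head dimension 128, 2 bytes per element for fp16/bf16) are assumptions about the internlm2-chat-7b configuration and should be checked against the model's `config.json`; the function name `kv_cache_bytes` is hypothetical. Model weights and activations are not counted.

```python
def kv_cache_bytes(seq_len, num_layers=32, num_kv_heads=8,
                   head_dim=128, bytes_per_elem=2):
    """Estimate the KV-cache size in bytes for one sequence.

    The defaults are assumed values for a 7B model with grouped-query
    attention; verify them against the actual model config.
    """
    # Factor of 2: one tensor for keys and one for values in every layer.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem * seq_len


if __name__ == "__main__":
    for tokens in (32 * 1024, 64 * 1024, 200 * 1024):
        gib = kv_cache_bytes(tokens) / 2**30
        print(f"{tokens:>7} tokens -> {gib:.1f} GiB of KV cache")
```

Under these assumptions the cache grows linearly with sequence length, at about 128 KiB per token, so a 200K-token context needs roughly six times the cache memory of a 32K one, on top of the weights.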
Describe the problem
internlm2-chat-7b can support 200K tokens through lmdeploy, but what context length does the model itself support?