Will the KV cache become invalid? #1099

Open
oslijunw opened this issue Apr 16, 2024 · 0 comments

In a multi-threaded setup where the GPU server's resources are insufficient, can the KV cache be preempted? For example, suppose two long conversations are running at the same time. If both are halfway through and conversation A cuts in on conversation B, I would expect conversation B's KV cache to be lost, so that conversation A's KV cache is used in its place. I cannot verify this myself, because it requires GPU computation and I do not have sufficient resources.
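
To make the scenario concrete, here is a toy sketch of what I am asking about. It is not taken from this project's code: `SharedCacheModel`, `PerConversationCacheModel`, `step`, and `interleave` are hypothetical names, and a plain list stands in for the cached key/value tensors. It only contrasts a single cache shared by all callers with one cache per conversation, and says nothing about which of the two this project actually does.

```python
import threading


class SharedCacheModel:
    """Hypothetical: ONE cache shared by every conversation."""

    def __init__(self):
        self.kv = []        # stands in for cached key/value tensors
        self.owner = None   # conversation that last wrote to the cache
        self.lock = threading.Lock()

    def step(self, conv_id, token):
        with self.lock:
            if self.owner != conv_id:
                # Another conversation cut in: the previous cache is discarded.
                self.kv = []
                self.owner = conv_id
            self.kv.append(token)
            return len(self.kv)  # number of cached positions still usable


class PerConversationCacheModel:
    """Hypothetical: each conversation owns a separate cache."""

    def __init__(self):
        self.kv = {}
        self.lock = threading.Lock()

    def step(self, conv_id, token):
        with self.lock:
            cache = self.kv.setdefault(conv_id, [])
            cache.append(token)
            return len(cache)


def interleave(model):
    # Conversation "a" runs two steps, "b" cuts in once, then "a" resumes.
    return [
        model.step("a", 1),
        model.step("a", 2),
        model.step("b", 1),
        model.step("a", 3),
    ]


print(interleave(SharedCacheModel()))           # [1, 2, 1, 1]
print(interleave(PerConversationCacheModel()))  # [1, 2, 1, 3]
```

In the shared case, conversation "a" falls back to a cache length of 1 after "b" cuts in, which is the loss I am worried about. In the per-conversation case, it continues at 3.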
