Problem Description
When using the seamless_communication library, specifically its translator.predict() method, I have encapsulated the core inference logic into reusable interfaces. However, even after wrapping it, multiple requests are not processed concurrently; they execute sequentially, which significantly hurts system throughput and response time.
Result: All tasks complete sequentially rather than concurrently.
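The behavior can be reproduced without the library itself. Below is a minimal sketch in which a stub predict() stands in for translator.predict() (the stub function and its 0.2 s timing are illustrative assumptions, not the real API): because the blocking call runs directly inside each coroutine, the event loop is blocked and the "concurrent" tasks actually run one after another.

```python
import asyncio
import time


def predict(text: str) -> str:
    # Stub standing in for translator.predict(); simulates blocking inference.
    time.sleep(0.2)
    return f"translated:{text}"


async def translate(text: str) -> str:
    # Calling a blocking function directly inside a coroutine blocks the
    # event loop, so gathered tasks cannot overlap.
    return predict(text)


async def main() -> list[str]:
    t0 = time.perf_counter()
    results = await asyncio.gather(*(translate(t) for t in ["a", "b", "c"]))
    elapsed = time.perf_counter() - t0
    # Three 0.2 s calls take roughly 0.6 s in total: fully sequential.
    print(f"elapsed: {elapsed:.2f}s")
    return results


results = asyncio.run(main())
```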
Expected Behavior
I would expect to utilize the GPU's parallel processing capabilities by running multiple translations concurrently.
Actual Results
Only one request is processed at a time, leading to increased overall execution time.
Can you please suggest the modifications or configuration needed in my code so that translator.predict() can handle concurrent translations effectively?
If you need any further information or clarification, feel free to reply. Looking forward to your assistance in optimizing our application performance!
Thanks!