We already did some work on this last year for match prediction: #944. That fix worked when one large search was run before a small search.
However, in production we have encountered cases of multiple large searches running in parallel, at which point our existing fix no longer seems to be effective, probably because there is an upper limit to how far we can scale horizontally. So we need to consider other ways to optimise the internal queuing of requests/batches to prevent this problem.
Note: if we improve efficiency of single searches, as covered by #1280, then we may not need to consider further queue management optimisations. Below are just thoughts I want to capture somewhere and not necessarily ideas to be executed. 😄
Possible Ideas

Limit parallelism of mismatch requests
We know that mismatch Adult searches bring back significantly more donors than 10/10 searches, and these are the matching requests that tend to fail when the app service plan is under load.
Users at WMDA are required to run a 10/10 search before a mismatch search.
If the 10/10 search brings back >40K results, they are not allowed to submit a mismatch search.
Could we place a limit on how many mismatch Adult searches are run in parallel?
A 10/10 search is the first search for a patient and is therefore the most important to complete quickly.
Mismatch searches place more load on the system, so we want to control how many of them run in parallel.
Pros: the most urgent 10/10 searches will be prioritised and not blocked by resource-intensive mismatch searches.
Cons:
If many mismatch Adult searches happen to be submitted at once, the "mismatch queue" may become too long.
This could be mitigated by making the parallelism limit configurable.
I am not sure how to implement this sensibly within the existing service bus topic architecture - needs an HLD. One possible shape is sketched below.
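To make the idea concrete, here is a minimal, hypothetical sketch (not a design) of how mismatch requests could be split out within a Service Bus topic: route them to a dedicated subscription via a message application property, then cap the concurrency of whatever consumes that subscription. The topic name, subscription name, and SearchType property below are illustrative assumptions, not existing Atlas identifiers.

```csharp
// Hypothetical sketch: "matching-requests-topic", "mismatch-searches" and the
// "SearchType" application property are illustrative names, not Atlas identifiers.
using Azure.Messaging.ServiceBus.Administration;

var admin = new ServiceBusAdministrationClient("<service-bus-connection-string>");

// Create a subscription that only receives messages flagged as mismatch searches,
// so a separate consumer with capped concurrency can drain it at a controlled rate.
await admin.CreateSubscriptionAsync(
    new CreateSubscriptionOptions("matching-requests-topic", "mismatch-searches"),
    new CreateRuleOptions("MismatchOnly", new SqlRuleFilter("SearchType = 'mismatch'")));
```

The function bound to that subscription could then have its parallelism capped via the Service Bus trigger's maxConcurrentCalls setting in host.json, which would also make the limit configurable per environment, as per the mitigation above.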