Skip to content

Cache initial results of the Samples page #180

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Apr 3, 2025
Merged

Conversation

qu8n
Copy link
Collaborator

@qu8n qu8n commented Mar 25, 2025

This PR made the following changes:

  • Cache the results of the initial queries on the Samples page (Zenhub ticket #1453)
  • Fix a bug that caused duplicate fetches when refreshing cache, leading to cache errors (Zenhub ticket #1449)
  • Add timestamps to console logs to ease future debugging

This PR reduces the % memory usage from 44.5 to 43.8 (seen in dev).

qu8n added 11 commits March 20, 2025 11:28
- Move the existing Oncotree cache object "one level deeper" to the
  value of the key "oncotree" inside the main node-cache object. This
  accomplishes two things: (1) organize/add more data into the cache
  e.g. samples query results, and (2) call the method that refreshes
  the Oncotree cache only once when the Oncotree cache expires.
  Previously, we had a bug where this method was called for every single
  Oncotree cache item, leading to numerous calls to our database and
  the Oncotree API
- Rename some instances of "Oncotree data" to "Oncotree API data" to
  make clearer Oncotree data from Neo4j vs. those from the API
- Remove the includeCancerTypeFieldsInSearch function inside oncotree.ts
  that is no longer used and needed
@@ -5,6 +5,7 @@ import { configureApp } from "./middlewares/configureApp";
import { configureSession } from "./middlewares/configureSession";
import { configureLogging } from "./middlewares/configureLogging";
import { configureRoutes } from "./routes";
require("log-timestamp"); // adds a timestamp to every log statement
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To help us better diagnose console errors of the Express app in the future.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many parts in this file, especially around setting up of the Oncotree cache, comes from the old oncotree.ts. I renamed that file to this cache.ts to include all cache-related code.

See the current oncotree.ts for reference.

@qu8n qu8n marked this pull request as ready for review March 26, 2025 20:09
@qu8n qu8n requested a review from ao508 March 26, 2025 20:09
@qu8n
Copy link
Collaborator Author

qu8n commented Mar 26, 2025

@ao508 This PR has been deployed to the dashboard in dev for your review.

Copy link
Collaborator

@ao508 ao508 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the right approach is for addressing discrepancies because it is a little awkward.

My flow for updating sample 08039_JQ_6

  1. I edited a sample in the samples view (/samples)

As noted already in discussions, this update isn't going to show up on the /samples page until the cache refreshes
image

But it will show up if you search for the sample in the /samples view

image

And then if you were to immediately clear your search the table will go back to showing the outdated information

image

The options I can think of now are

  1. Put a pin in this issue for now and just communicate with the PMs that this discrepancy is going to exist for the time being
  2. Find a way to inject the update(s) to the sample(s) in the existing cache so as to make it seem as if the cache really is up to date
  3. Trigger a refresh when a sample update happens
  4. ?????

Quan's notes:
We discussed and aligned on option 2 as it seems like the most performant option.

@qu8n
Copy link
Collaborator Author

qu8n commented Mar 28, 2025

I'm not sure what the right approach is for addressing discrepancies because it is a little awkward.

My flow for updating sample 08039_JQ_6

  1. I edited a sample in the samples view (/samples)

As noted already in discussions, this update isn't going to show up on the /samples page until the cache refreshes image

But it will show up if you search for the sample in the /samples view

image

And then if you were to immediately clear your search the table will go back to showing the outdated information

image

The options I can think of now are

  1. Put a pin in this issue for now and just communicate with the PMs that this discrepancy is going to exist for the time being
  2. Find a way to inject the update(s) to the sample(s) in the existing cache so as to make it seem as if the cache really is up to date
  3. Trigger a refresh when a sample update happens
  4. ?????

Quan's notes: We discussed and aligned on option 2 as it seems like the most performant option.

@ao508 This has been fixed using option 2, deployed to dev, and tested with multiple sample updates.

@qu8n qu8n requested a review from ao508 March 28, 2025 17:03
Copy link
Collaborator

@ao508 ao508 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's very close to fixing that awkward behavior we noticed yesterday but still not quite there.

I updated the sample type for this sample to Normal and while the sample type field shows that update, the cmo sample name does not (I confirmed that the db entry did get an updated cmo sample name as a result of the update I made)

image

The issue is that we know what update the user wants to make on Submit ... but we technically don't know what the label is going to. be updated to (if it's updated).

Maybe this is a separate mechanism that we need to incorporate? Something that communicates that label update separately?

Actually while we're thinking about that.. what if the status and validation report become "failed"? I think we'd see a similar issue there..

We probably need to brainstorm this a bit more. One thing I can sorta imagine bridging this weird gap in data updates is to schedule a delayed query to neo4j with by the sample primary ids that we are expecting an update for?

Edit: To add - I think these are the only two pieces of info (cmoSampleName and status) that we're going to have to handle this way. Maybe we can even build a small query that just returns these two?

@qu8n qu8n requested a review from ao508 April 2, 2025 14:16
Copy link
Collaborator

@ao508 ao508 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks!

DashboardSample,
| "primaryId"
| "cmoSampleName"
| "historicalCmoSampleNames"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! I forgot about having to include this field in the query for latest updates.

@qu8n qu8n merged commit e317e22 into mskcc:master Apr 3, 2025
2 checks passed
@qu8n qu8n deleted the cache-samples branch April 3, 2025 12:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants