
Updated caching strategy #576

Open
wants to merge 33 commits into master
Conversation

Contributor

@AaDalal AaDalal commented Feb 4, 2024

No description provided.

Contributor

@el-agua el-agua left a comment


lgtm

More comments would be helpful. There might be a more efficient way that doesn't require looping through all topics, but it looks fine for now.


def test_one_course(self):
    self.set_runtime_option()
    self.precompute_reviews()
Contributor


Doesn't matter for functionality, but you can also assert error/success responses with the self.out defined
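The suggested pattern can be sketched as below, with `io.StringIO` standing in for the command's captured output stream. All names here are illustrative, not the project's actual test API.

```python
import io


class PrecomputeTestCase:
    """Minimal sketch of asserting on captured command output via self.out."""

    def set_runtime_option(self):
        # The test would pass this stream as stdout to the management command.
        self.out = io.StringIO()

    def precompute_reviews(self):
        # Stand-in for invoking the precompute command with stdout=self.out.
        self.out.write("Now precomputing PCR reviews.\n")

    def test_one_course(self):
        self.set_runtime_option()
        self.precompute_reviews()
        # Beyond checking side effects, assert on the success/error text:
        assert "precomputing PCR reviews" in self.out.getvalue()
```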

backend/review/models.py (resolved)
Contributor Author

@AaDalal AaDalal left a comment


Mostly nits / naming things; only 1-2 substantive comments.

Feel free to ignore the nits if you don't have time to fix -- they aren't meant to slow you down!

"course-reviews",
"CIS-1200",
{"hello": "world"}, # No cache miss
query_params={"semester": OLD_SEMESTER},
Contributor Author


@shiva-menta we should make sure this is intended behavior

@@ -277,6 +278,7 @@ def handle(self, *args, **kwargs):

print("Recomputing Section.has_reviews...")
recompute_has_reviews()
precompute_pcr_views(True, True)
Contributor Author


nit: maybe make these keyword args

if verbose:
    print("Now precomputing PCR reviews.")

CachedReviewResponse.objects.all().update(expired=True)
Contributor Author


This is not in an atomic block, so what's the potential window during which the cached responses are entirely unavailable? Is that acceptable?

Contributor Author


NVM, I see what you're doing.

Contributor Author


expired CachedReviewResponses can still be returned from the view

Contributor


Yeah, but I think there's no reason not to have this in an atomic block either; will change.
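The mark-then-recompute pattern under discussion can be sketched with a plain dict standing in for the CachedReviewResponse table (all names here are illustrative). The key point is that entries are only *marked* expired rather than deleted, so the view can keep serving stale responses while recomputation runs; truly dead topics are only dropped at the end.

```python
def refresh_cache(cache, current_topics, compute_response):
    # Phase 1: mark everything expired (analogous to
    # CachedReviewResponse.objects.all().update(expired=True)).
    for entry in cache.values():
        entry["expired"] = True

    # Phase 2: recompute responses for live topics, un-expiring each
    # entry as it is refreshed. Stale entries stay readable meanwhile,
    # so there is no window with an empty cache.
    for topic_id in current_topics:
        cache[topic_id] = {
            "response": compute_response(topic_id),
            "expired": False,
        }

    # Phase 3: drop entries whose topics no longer exist.
    for topic_id in [t for t, e in cache.items() if e["expired"]]:
        del cache[topic_id]
```

Wrapping all three phases in a transaction additionally guarantees readers never observe the intermediate all-expired state.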

    objs_to_update.append(response_obj)
except Http404:
    logging.info(
        f"Topic returned 404 (topic_id {topic_id}, "
Contributor Author


nit: is a 404 expected behavior? If so, logging.info makes sense; if not, logging.error might be better so it looks correct in our Sentry.

Contributor


Iirc there are some topics that have a course with no reviews, and that leads to a 404 in our current implementation (can't remember if this is the exact error, but it's something similar). So yeah, I'd say it can be expected behavior for certain courses.


if topic_id in topic_set:
    try:
        has_count += 1
Contributor Author


nit: usually try to minimize the contents of the try block to what is actually expected to throw the error (for debugging/readability reasons).
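The nit can be illustrated like this, with hypothetical names standing in for the loop above: only the call that can actually raise stays inside the `try`, and bookkeeping like incrementing counters moves out.

```python
import logging


def count_cached_topics(topic_ids, topic_set, fetch_response):
    """Illustrative rewrite keeping the try block minimal (names are
    hypothetical, not the project's actual helpers)."""
    has_count = 0
    for topic_id in topic_ids:
        if topic_id not in topic_set:
            continue
        try:
            response = fetch_response(topic_id)  # the only risky call
        except KeyError:
            logging.info("Topic lookup failed (topic_id %s)", topic_id)
            continue
        has_count += 1  # safe bookkeeping stays outside the try block
    return has_count
```

This way a failure in the counter logic can never be swallowed by the except clause, and a traceback unambiguously points at the risky call.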

print("Now precomputing PCR reviews.")

CachedReviewResponse.objects.all().update(expired=True)
responses = CachedReviewResponse.objects.all()
Contributor Author


nit: I got a little confused because this is called responses. Maybe cached_responses is a better name?


CachedReviewResponse.objects.all().update(expired=True)
responses = CachedReviewResponse.objects.all()
topic_set = {response.topic_id: response for response in responses}
Contributor Author


nit: since this is a dict, and is also a mapping only for cached responses (not for all topics), it might be good to rename it.

to a JSON object storing summarized course review data (all the data frontend uses to display
reviews).
"""

Contributor Author


I think topic_id should be unique=True here (theoretically our loading script should enforce this, but I think it's good to specify here too).

topic = recent_course.topic
course_id_list = list(topic.courses.values_list("id"))
topic_id = ".".join([str(id[0]) for id in sorted(course_id_list)])
cache.set(course_code, topic_id, MONTH_IN_SECONDS)
Contributor Author


why do we set the cache manually?

Contributor Author


instead of using the decorator

Contributor Author


also we should make sure this cache is cleared every time we reimport, and/or that the TTL is short enough.

Contributor


The cache is set manually because we want control over both the course-to-topic mappings and the topic-to-response mappings. Using the decorator, since the routes are keyed by course, would only allow a course-to-response mapping, which can be a bit space-inefficient.
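The two-level layout described here can be sketched with plain dicts standing in for the Django cache (names are illustrative): many course codes map to one small topic_id entry, and the large review response is stored once per topic rather than once per course code. The topic_id scheme matches the quoted snippet above: sorted course ids joined by ".".

```python
def make_topic_id(course_ids):
    # Sorted course ids joined by ".", as in the snippet above.
    return ".".join(str(i) for i in sorted(course_ids))


# Plain dicts standing in for the two cache namespaces.
course_to_topic = {}
topic_to_response = {}


def cache_topic(course_codes, course_ids, response):
    topic_id = make_topic_id(course_ids)
    for code in course_codes:
        course_to_topic[code] = topic_id    # small entry, one per code
    topic_to_response[topic_id] = response  # big entry, one per topic


def get_cached_response(course_code):
    # Two lookups: course code -> topic id -> shared response.
    topic_id = course_to_topic.get(course_code)
    return topic_to_response.get(topic_id)
```

With a per-route decorator, by contrast, each cross-listed course code would cache its own full copy of the response.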

Contributor


Yup the relevant cache entries are cleared in the precompute_pcr_reviews script.
