Skip to content

feat: add permanent environment document cache #5187

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Apr 11, 2025

Conversation

matthewelwell
Copy link
Contributor

@matthewelwell matthewelwell commented Mar 5, 2025

Changes

This PR sets out to add a permanent cache of the environment document, similar to how our SaaS environment works where it writes the document to dynamodb. This updates the current logic for caching the environment document, but hooks it into the same logic that updates the document in dynamodb for our SaaS platform. It also makes the cache backend customisable to allow self hosted installs to use, e.g. redis to cache the document.

This PR also adds a new metric called flagsmith_environment_document_cache_result which tracks the number of cache hits and misses for the environment document cache.

How did you test this code?

Added a new test to verify the flow end-to-end. I also tested it manually, as can be seen in this comment.

Copy link

vercel bot commented Mar 5, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Apr 11, 2025 1:07pm
2 Skipped Deployments
Name Status Preview Comments Updated (UTC)
flagsmith-frontend-preview ⬜️ Ignored (Inspect) Visit Preview Apr 11, 2025 1:07pm
flagsmith-frontend-staging ⬜️ Ignored (Inspect) Visit Preview Apr 11, 2025 1:07pm

@github-actions github-actions bot added api Issue related to the REST API feature New feature or request labels Mar 5, 2025
Copy link
Contributor

github-actions bot commented Mar 5, 2025

Uffizzi Preview deployment-61569 was deleted.

Comment on lines 294 to 308
if is_saas():
environment_wrapper.write_environments(environments)

if (
project.edge_v2_environments_migrated
and environment_v2_wrapper.is_enabled
): # type: ignore[union-attr]
environment_v2_wrapper.write_environments(environments)
elif not settings.EXPIRE_ENVIRONMENT_DOCUMENT_CACHE:
environment_document_cache.set_many(
{
e.api_key: map_environment_to_environment_document(e)
for e in environments
}
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe the is_saas() check belongs here. I'd rather allow our users to employ their Dynamo in the future, e.g., by writing a Dynamo cache backend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I somewhat agree with this. Let me see if I can refactor this (and the environment variables as per the above conversation).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking into this a bit more, I think my concern here is that this implies a much larger refactor.

I guess the ideal solution would be to refit our SaaS platform to essentially have something like:

ENVIRONMENT_DOCUMENT_CACHE_MODE: PERSISTENT
ENVIRONMENT_DOCUMENT_CACHE_BACKEND: cache.backends.custom.dynamodb

where cache.backends.custom.dynamodb is our own implementation of the django cache backend which handles the interactions with dynamodb.

While the prospect of this certainly excites me, I don't think it should necessarily be in scope for this piece of work.

An option to solve your direct comment here would be to remove the call to is_saas() and instead use some other setting, perhaps derived from whether e.g. ENVIRONMENTS_TABLE_NAME_DYNAMO is set.

Thoughts @khvn26?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually found a half decent halfway solution as there was already a conditional check in the code that was only relevant to the dynamodb path, so I've just used that as the check instead of is_saas for now. I still think the custom cache backend would be the best option, and may well simplify a lot of the dynamo integration code.

@github-actions github-actions bot added feature New feature or request and removed feature New feature or request labels Mar 7, 2025
@github-actions github-actions bot added feature New feature or request and removed feature New feature or request labels Mar 13, 2025
@github-actions github-actions bot added feature New feature or request and removed feature New feature or request labels Mar 13, 2025
@github-actions github-actions bot added feature New feature or request and removed feature New feature or request labels Mar 17, 2025
Copy link

codecov bot commented Mar 17, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.54%. Comparing base (3a98aca) to head (fd0f9d3).
Report is 7 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff            @@
##             main    #5187    +/-   ##
========================================
  Coverage   97.53%   97.54%            
========================================
  Files        1225     1227     +2     
  Lines       42551    42676   +125     
========================================
+ Hits        41503    41629   +126     
+ Misses       1048     1047     -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions github-actions bot added feature New feature or request and removed feature New feature or request labels Mar 17, 2025
@matthewelwell
Copy link
Contributor Author

matthewelwell commented Mar 21, 2025

More testing is needed here, but initial signs are good regarding performance. I tested this code by building the docker image locally and using the following (simplified for brevity) docker compose file.

volumes:
  pgdata:

services:
  postgres:
    image: postgres:15.5-alpine
    environment:
      POSTGRES_PASSWORD: password
      POSTGRES_DB: flagsmith
    container_name: flagsmith_postgres
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ['CMD-SHELL', 'pg_isready -d flagsmith -U postgres']
      interval: 2s
      timeout: 2s
      retries: 20
      start_period: 20s

  flagsmith:
    image: flagsmith/flagsmith:local
    environment:
      DATABASE_URL: postgresql://postgres:password@postgres:5432/flagsmith

      ENVIRONMENT: production # set to 'production' in production.
      DJANGO_ALLOWED_HOSTS: '*' # Change this in production
      ALLOW_ADMIN_INITIATION_VIA_CLI: 'true' # Change this in production
      FLAGSMITH_DOMAIN: localhost:8000 # Change this in production
      DJANGO_SECRET_KEY: secret

      # Enable Task Processor
      TASK_RUN_METHOD: TASK_PROCESSOR

      ENVIRONMENT_DOCUMENT_CACHE_MODE: PERSISTENT
      ENVIRONMENT_DOCUMENT_CACHE_BACKEND: django.core.cache.backends.redis.RedisCache
      ENVIRONMENT_DOCUMENT_CACHE_LOCATION: redis://redis_db:6379/1

    ports:
      - 8000:8000
    healthcheck:
      test: ['CMD-SHELL', 'python /app/scripts/healthcheck.py']
      interval: 2s
      timeout: 2s
      retries: 20
      start_period: 20s
    depends_on:
      postgres:
        condition: service_healthy
      redis_db:
        condition: service_started

  flagsmith_processor:
    image: flagsmith/flagsmith:local
    environment:
      DATABASE_URL: postgresql://postgres:password@postgres:5432/flagsmith
      USE_POSTGRES_FOR_ANALYTICS: 'true'

      LOG_LEVEL: DEBUG

      ENVIRONMENT_DOCUMENT_CACHE_MODE: PERSISTENT
      ENVIRONMENT_DOCUMENT_CACHE_BACKEND: django.core.cache.backends.redis.RedisCache
      ENVIRONMENT_DOCUMENT_CACHE_LOCATION: redis://redis_db:6379/1
    depends_on:
      flagsmith:
        condition: service_healthy
    command: run-task-processor

  redis_db:
    image: redis:latest

In order to compare performance, I simply removed the ENVIRONMENT_DOCUMENT_X environment variables from the flagsmith container and restarted the docker compose service.

Here are the before and afters:

Before

❯ wrk http://localhost:8000/api/v1/environment-document/ -H 'X-Environment-Key: ser.XXX'

Running 10s test @ http://localhost:8000/api/v1/environment-document/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   154.93ms  338.61ms   1.69s    89.80%
    Req/Sec   128.30     21.90   151.00     91.67%
  2179 requests in 10.09s, 3.93MB read
Requests/sec:    216.05
Transfer/sec:    399.19KB

After

❯ wrk http://localhost:8000/api/v1/environment-document/ -H 'X-Environment-Key: ser.XXX'

Running 10s test @ http://localhost:8000/api/v1/environment-document/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    15.68ms  115.10ms   1.61s    98.57%
    Req/Sec     1.64k   366.56     2.22k    56.00%
  32616 requests in 10.01s, 63.83MB read
Requests/sec:   3259.35
Transfer/sec:      6.38MB

@github-actions github-actions bot added feature New feature or request and removed feature New feature or request labels Mar 21, 2025
@matthewelwell matthewelwell marked this pull request as ready for review March 21, 2025 17:42
@matthewelwell matthewelwell requested a review from a team as a code owner March 21, 2025 17:42
@matthewelwell matthewelwell requested review from gagantrivedi and removed request for a team March 21, 2025 17:42
@matthewelwell matthewelwell force-pushed the feat/environment-document-cache branch from 62b7632 to 7e9de60 Compare April 11, 2025 09:44
@github-actions github-actions bot added docs Documentation updates feature New feature or request and removed feature New feature or request docs Documentation updates labels Apr 11, 2025
Comment on lines 434 to 437
flagsmith_environment_document_cache_result.labels(
api_key=api_key,
result=CACHE_HIT if cache_hit else CACHE_MISS,
).inc()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an argument to say that this metric incrementing should be done in Environment.get_environment_document above but I decided that it didn't really make sense to track cache hits / misses in an instance that isn't configured to use the cache.

@github-actions github-actions bot added docs Documentation updates feature New feature or request and removed feature New feature or request docs Documentation updates labels Apr 11, 2025
@github-actions github-actions bot added docs Documentation updates feature New feature or request and removed feature New feature or request docs Documentation updates labels Apr 11, 2025
@github-actions github-actions bot added docs Documentation updates feature New feature or request and removed feature New feature or request docs Documentation updates labels Apr 11, 2025
@matthewelwell matthewelwell requested a review from khvn26 April 11, 2025 13:14
@matthewelwell matthewelwell merged commit 08e88c3 into main Apr 11, 2025
38 checks passed
@matthewelwell matthewelwell deleted the feat/environment-document-cache branch April 11, 2025 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api Issue related to the REST API feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants