Set up Prod/Stg/Sandbox environments #342

Open · 5 of 9 tasks
maxachis opened this issue Jun 27, 2024 · 5 comments
Assignees: maxachis
Labels: database, devops, documentation, Github Action, v2

Comments


maxachis commented Jun 27, 2024

As discussed previously in #340, here is the implementation!

TODO

  • Create a new Sandbox database in Digital Ocean, with the same specs as pdap-db-dev (which unfortunately cannot be easily renamed).
  • Update the prod-to-dev-migration repository to construct the Stage and Sandbox databases in the same run, with Stage using the data from Production and Sandbox using only the schema from Production (see the sketch after this list).
  • Update automation.pdap.io with the environment variables necessary for the refresh to function properly, modifying its configuration as needed.
  • Rename the Prod-To-Dev-Migration job in automation.pdap.io to reflect its new behavior.
  • Develop dummy data rows to populate the sandbox database, and modify the prod-to-dev-migration repository to automatically add them to the sandbox database on migration. This website might be helpful.
  • Add documentation in Notion describing how everything works.
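
For the second item, here is a minimal Python sketch of the two refresh paths (full copy for Stage, schema-only for Sandbox) built on pg_dump/pg_restore. The connection strings, database names, and function names are placeholders, not the actual prod-to-dev-migration scripts:

```python
"""Sketch of the Stage and Sandbox refresh paths. Connection strings and
database names are placeholders, not the real Digital Ocean values."""
import subprocess

PROD_URL = "postgresql://user:password@prod-host:5432/pdap_prod"          # placeholder
STAGE_URL = "postgresql://user:password@stage-host:5432/pdap_stage"       # placeholder
SANDBOX_URL = "postgresql://user:password@sandbox-host:5432/pdap_sandbox"  # placeholder


def refresh_stage():
    """Stage mirrors Production: full dump (schema + data), restored over Stage."""
    dump = subprocess.run(
        ["pg_dump", "--no-owner", "--format=custom", PROD_URL],
        check=True, capture_output=True,
    )
    subprocess.run(
        ["pg_restore", "--clean", "--if-exists", "--no-owner", "-d", STAGE_URL],
        input=dump.stdout, check=True,
    )


def refresh_sandbox():
    """Sandbox gets only Production's schema; data is added separately (dummy data)."""
    dump = subprocess.run(
        ["pg_dump", "--schema-only", "--no-owner", "--format=custom", PROD_URL],
        check=True, capture_output=True,
    )
    subprocess.run(
        ["pg_restore", "--clean", "--if-exists", "--no-owner", "-d", SANDBOX_URL],
        input=dump.stdout, check=True,
    )


if __name__ == "__main__":
    refresh_stage()
    refresh_sandbox()
```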

Optional/Debatable TODO

  • To maintain clarity, destroy pdap-db-dev and build a new database, pdap-db-stg, redirecting all Stage connections to it.
  • Rename variables in run_pytest.yaml (as well as in the associated Python and Github configuration components) to reflect the renamed environments. Not strictly necessary, but it will save us confusion later on.
  • Create a new issue for providing instructions and resources for building a local version of the database for personal testing, and for populating it with dummy data.
maxachis self-assigned this Jun 27, 2024
maxachis added the documentation, database, v2, Github Action, and devops labels Jun 27, 2024

maxachis commented Jun 28, 2024

automation.pdap.io now has two builds: one for Sandbox and one for Stage. They should run around the same time, but splitting them gives us more flexibility if we need it in the future.

Next up is the question of dummy data.

In some cases we may not need any dummy data and can simply copy data from Production -- tables such as zip_codes and state_names don't contain any sensitive information, so we can use them as-is.

Then there are cases where we simply don't need all of the data -- for example, quick_search_query_logs, as previously discussed, contains far more content than we need, and copying it would slow things down.

We also want to be mindful that our schemas will change: the more dummy data we maintain, the more we'll have to update whenever the schema changes.

Additionally, some newly created tables, such as requests_v2, have no data at all, and adding some dummy data will help us test them before bringing them into production.

For now, I think we can get away with a handful of dummy rows in sensitive tables such as users, access_tokens, session_tokens, and data_requests and in empty tables such as requests_v2, importing the non-sensitive tables en masse and skipping quick_search_query_logs entirely.
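
To make that plan concrete, a rough per-table policy for the Sandbox refresh could look like the sketch below, using the tables named above. The strategy labels ("copy", "dummy", "skip") are illustrative only, not an existing convention in prod-to-dev-migration:

```python
# Per-table policy sketch for the Sandbox refresh. Strategy names are
# illustrative placeholders, not an existing convention in the repository.
SANDBOX_TABLE_POLICY = {
    # Non-sensitive reference tables: copy as-is from Production.
    "zip_codes": "copy",
    "state_names": "copy",
    # Sensitive tables: replace with a handful of dummy rows.
    "users": "dummy",
    "access_tokens": "dummy",
    "session_tokens": "dummy",
    "data_requests": "dummy",
    # New, still-empty tables: dummy rows let us exercise them before production use.
    "requests_v2": "dummy",
    # Oversized logs we don't need in Sandbox.
    "quick_search_query_logs": "skip",
}


def tables_with(strategy: str) -> list[str]:
    """Return the tables assigned a given strategy, e.g. tables_with("copy")."""
    return [table for table, s in SANDBOX_TABLE_POLICY.items() if s == strategy]
```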


maxachis commented Jun 28, 2024

Scripts have been modified so that the sandbox database now receives non-sensitive data from production. Next up is dummy data.

Dummy Data TODO

  • Create a dummy_data folder in the repository, with an empty CSV for each table to be filled with dummy data
  • Modify the sandbox script to load dummy data from the CSVs (see the sketch after this list)
  • Fill the CSVs with at least one row each to confirm functionality
  • Add Python setup info to ensure the Python script functions properly
  • Confirm the functionality works in a separate branch when called from automation.pdap.io
  • Merge into main
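
A minimal sketch of the CSV-loading step, assuming one CSV per table in the dummy_data folder, with the file name matching the table name and a header row in each file; the connection string is a placeholder:

```python
"""Sketch of loading dummy_data/*.csv into the sandbox database.
Connection details are placeholders, not the real values."""
from pathlib import Path

import psycopg2

SANDBOX_URL = "postgresql://user:password@sandbox-host:5432/pdap_sandbox"  # placeholder
DUMMY_DATA_DIR = Path("dummy_data")


def load_dummy_data():
    conn = psycopg2.connect(SANDBOX_URL)
    try:
        with conn, conn.cursor() as cur:
            for csv_path in sorted(DUMMY_DATA_DIR.glob("*.csv")):
                table = csv_path.stem  # e.g. dummy_data/users.csv -> users
                with open(csv_path) as f:
                    # HEADER true skips the first row; column order in the CSV
                    # must match the table definition.
                    cur.copy_expert(
                        f'COPY "{table}" FROM STDIN WITH (FORMAT csv, HEADER true)', f
                    )
    finally:
        conn.close()


if __name__ == "__main__":
    load_dummy_data()
```

In practice the load order may matter because of foreign keys, so the sorted glob is just a stand-in for an explicit ordering.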

maxachis commented:

Additionally, in the course of developing this logic, I came to the conclusion that Python would be preferable to shell scripts, and made an issue accordingly.

maxachis commented:

@josh-chamberlain I'll need to be made a member of the Notion workspace to add a "Testing" page with information on the Stage and Sandbox databases ⛑️

josh-chamberlain commented:

Nice work! I like your optionals, too.
