[FEATURE] Data Insert Logic - Commands #380

Open
mikeyavorsky opened this issue May 15, 2024 · 1 comment
@mikeyavorsky
Collaborator

mikeyavorsky commented May 15, 2024

Extend the Flask API to read from a .jsonl file of scraped Commands data and insert it into the Postgres DB.

Is your feature request related to a problem? Please describe.
As part of our data ETL pipeline, we need to take data that has been scraped and loaded into an S3 bucket, transform it, and load it into our database. This feature focuses on loading "commands" (aka employment.unit) data.

Describe the solution you'd like

  • Read data from the attached .jsonl file
  • Transform the data so that it conforms to our employment data schema
  • Check each record against previously loaded employment data: skip exact duplicates, and update the existing row if the data has changed
  • Load the employment data into the database
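
The steps above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the field names (`name`, `url`, `website_url`), the choice of `name` as the natural key, and the helpers `read_jsonl`, `transform`, `fingerprint`, and `plan_load` are all hypothetical, since the real scraper output and employment schema are not described here. Change detection uses a content hash, which is one possible approach among several.

```python
import hashlib
import json
from typing import Iterable, Iterator


def read_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-blank line of JSONL input."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def transform(raw: dict) -> dict:
    """Map a scraped record onto a hypothetical unit schema (assumed fields)."""
    return {
        "name": raw["name"].strip(),
        "website_url": (raw.get("url") or "").strip() or None,
    }


def fingerprint(record: dict) -> str:
    """Stable hash of a record's content, used to detect changes."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_load(records: Iterable[dict], existing: dict) -> dict:
    """Classify each record as insert, update, or skip.

    `existing` maps the natural key (assumed here to be `name`) to the
    fingerprint stored when that record was last loaded.
    """
    plan = {"insert": [], "update": [], "skip": []}
    for record in records:
        key = record["name"]
        if key not in existing:
            plan["insert"].append(record)
        elif existing[key] != fingerprint(record):
            plan["update"].append(record)
        else:
            plan["skip"].append(record)
    return plan
```

On the Postgres side, the insert and update buckets could alternatively be collapsed into a single `INSERT ... ON CONFLICT ... DO UPDATE` statement, provided the table has a unique constraint on whatever natural key is chosen.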

Additional context
This task requires scraper output as input. You can download that fixture here.
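
For local development, a sketch of reading such a fixture might look like the following. It assumes the fixture arrives as a zip archive containing one or more `.jsonl` files (the attachment in this thread is named 50a.jsonl.zip); the helper name `iter_jsonl_from_zip` is hypothetical, and the internal layout of the archive is an assumption.

```python
import io
import json
import zipfile


def iter_jsonl_from_zip(source):
    """Yield parsed records from every .jsonl member of a zip archive.

    `source` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        for member in archive.namelist():
            # Skip non-JSONL members and macOS resource-fork entries.
            if not member.endswith(".jsonl") or member.startswith("__MACOSX/"):
                continue
            with archive.open(member) as fh:
                for raw_line in io.TextIOWrapper(fh, encoding="utf-8"):
                    if raw_line.strip():
                        yield json.loads(raw_line)
```

Because the function is a generator, records are parsed one line at a time rather than loading the whole file into memory, which may matter if the production files pulled from S3 are large.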

mikeyavorsky changed the title from "[FEATURE] Read scraped data for Command and insert into Postgres" to "[FEATURE] Data Insert Logic - Commands" on May 15, 2024
@mikeyavorsky
Collaborator Author

50a.jsonl.zip
