Add workflow to gen data #329
base: feat/pull-v2-api
Changes from 244 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,7 +4,7 @@ | |
"service": "app", | ||
"workspaceFolder": "/workspace", | ||
"remoteUser": "vscode", | ||
"postCreateCommand": "bash ./.devcontainer/post-create-command.sh", | ||
//"postCreateCommand": "bash ./.devcontainer/post-create-command.sh", | ||
Review comment: Remove commented-out line
Reply: done |
||
"postStartCommand": "git config --global --add safe.directory ${containerWorkspaceFolder}", | ||
"forwardPorts": [4567, 5432], | ||
"extensions": [ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -26,8 +26,10 @@ services: | |
network_mode: service:db | ||
|
||
db: | ||
#image: postgres:16.0-bullseye | ||
Review comment: Remove commented-out lines
Reply: done |
||
#image: postgres:latest | ||
image: postgres:15.4 | ||
#image: postgres:15.4 | ||
image: postgres:15.6-bullseye | ||
restart: unless-stopped | ||
volumes: | ||
- postgres-data:/var/lib/postgresql/data | ||
|
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,10 @@ | ||
#!/bin/bash | ||
|
||
set -e | ||
|
||
pip install --upgrade pip | ||
#pip install 'urllib3[secure]' | ||
pip install -r requirements.txt | ||
pip install -r gdrive_requirements.txt | ||
pip install -r download/requirements.txt | ||
sudo gem install pg bundler | ||
sudo bundle install |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,25 +1,103 @@ | ||
# This workflow will later be replaced with logic to "Generate Website Data" | ||
# The verify-gdrive.yml workflow file will be renamed to this one | ||
# We have to introduce this change in steps because GitHub gets confused until | ||
# we add the new workflow file to the master branch | ||
name: "Generate Website Data" | ||
on: | ||
workflow_dispatch: | ||
push: | ||
jobs: | ||
Review comment: Do we really want this to run on every single push? Do we have unlimited GH Actions hours?
Reply: I have not heard of limited hours. There is a concept of throttling, so if we are a highly active bunch, that may lead to problems. Otherwise, we're OK. I also have some checks to limit what is run. If we want to, we can limit jobs/steps based on the files that were changed. For now, I'm keeping it mostly simple, except for the piece that builds the Docker image (which only occurs when required).
Reply: Found this: "GitHub Actions usage is free for standard GitHub-hosted runners in public repositories". It's amazing how much compute & storage GH gives away for free 😄 |
||
generate: | ||
build: | ||
runs-on: ubuntu-latest | ||
env: | ||
REPO_OWNER: ${{ github.repository_owner}} | ||
REPO_BRANCH: ${{ github.ref_name }} | ||
SERVICE_ACCOUNT_KEY_JSON: ${{ secrets.SERVICE_ACCOUNT_KEY_JSON }} | ||
GDRIVE_FOLDER: ${{ vars.GDRIVE_FOLDER }} | ||
outputs: | ||
devcontainer: ${{ steps.filter.outputs.devcontainer }} | ||
noncontainer: ${{ steps.filter.outputs.noncontainer }} | ||
steps: | ||
- uses: actions/checkout@v3 | ||
- run: pip install -r gdrive_requirements.txt | ||
- run: python test_pull_from_gdrive.py | ||
- name: Archive pulled files | ||
uses: actions/upload-artifact@v3 | ||
- name: Get changed files | ||
id: changed-files | ||
uses: tj-actions/changed-files@v40 | ||
- name: List all changed files | ||
id: filter | ||
run: | | ||
devcontainer=false | ||
noncontainer=true | ||
for file in ${{ steps.changed-files.outputs.all_changed_files }}; do | ||
echo "$file was changed" | ||
if [[ ${{github.event_name}} = push ]]; then | ||
if [[ $file = .devcontainer* ]]; then | ||
devcontainer=true | ||
elif [[ $file = *requirements.txt* ]]; then | ||
devcontainer=true | ||
elif [[ $file = Gemfile* ]]; then | ||
devcontainer=true | ||
fi | ||
fi | ||
done | ||
|
||
echo "devcontainer=$devcontainer" >> $GITHUB_OUTPUT | ||
echo "noncontainer=$noncontainer" >> $GITHUB_OUTPUT | ||
- name: Login to GitHub Container Registry | ||
uses: docker/login-action@v3 | ||
with: | ||
name: redacted-netfile-files | ||
path: .local/downloads | ||
registry: ghcr.io | ||
username: ${{github.actor}} | ||
password: ${{secrets.GITHUB_TOKEN}} | ||
- name: Build dev container | ||
if: steps.filter.outputs.devcontainer == 'true' | ||
run: | | ||
docker build --no-cache --tag ghcr.io/caciviclab/disclosure-backend-static/${{github.ref_name}}:latest -f ./.devcontainer/Dockerfile . | ||
docker push ghcr.io/caciviclab/disclosure-backend-static/${{github.ref_name}}:latest | ||
- name: Check code changes | ||
Review comment: Are tags immutable on GHCR? If not, why not use github.ref_name as the tag instead of a separate "sub-repo" for every commit?
Reply: If I understand your first question, I think you're asking why we even use 'latest' instead of just using the branch name. The reason is mainly convention. Typically, there's a version number or string, but I don't want to generate a new version on every commit. I just want to replace the latest. "GitHub Packages usage is free for public packages." I don't think there's a limit and they actually don't let you delete it if it's popular. Also, we are creating an image per branch and not per commit. It gets overwritten per branch.
Reply: It's a small thing, but to clarify my first question, in the form |
||
if: steps.filter.outputs.noncontainer == 'true' | ||
run: | | ||
echo "TODO: run test to verify that code changes are good" | ||
generate: | ||
needs: build | ||
if: needs.build.outputs.noncontainer == 'true' | ||
runs-on: ubuntu-latest | ||
container: | ||
image: ghcr.io/caciviclab/disclosure-backend-static/${{github.ref_name}}:latest | ||
credentials: | ||
username: ${{ github.actor }} | ||
password: ${{ secrets.github_token }} | ||
env: | ||
REPO_OWNER: ${{ github.repository_owner}} | ||
REPO_BRANCH: ${{ github.ref_name }} | ||
SERVICE_ACCOUNT_KEY_JSON: ${{ secrets.SERVICE_ACCOUNT_KEY_JSON }} | ||
GDRIVE_FOLDER: ${{ vars.GDRIVE_FOLDER }} | ||
PGHOST: postgres | ||
PGDATABASE: disclosure-backend | ||
PGUSER: app_user | ||
PGPASSWORD: app_password | ||
services: | ||
postgres: | ||
#image: postgres:9.6-bullseye | ||
image: postgres:15.6-bullseye | ||
Review comment: Remove commented-out line
Reply: I'm trying to remind myself that the postgres versions don't match up with travis-ci and so we have to migrate travis-ci. I think I ran into some problem trying to set up 9.6.
Reply: Gotcha. Maybe a note about it in the comment then? |
||
env: | ||
POSTGRES_USER: app_user | ||
POSTGRES_DB: disclosure-backend | ||
Review comment: Would be nice to have these in top-level env vars in the workflow so they can be shared between the two places they get used.
Reply: done |
||
POSTGRES_PASSWORD: app_password | ||
steps: | ||
- uses: actions/checkout@v4 | ||
- name: Check setup | ||
run: | | ||
git -v | ||
# This keeps git from thinking that the current dir is not a repo even though a .git dir exists | ||
git config --global --add safe.directory "$GITHUB_WORKSPACE" | ||
psql -l | ||
echo "c1,c2" > test.csv | ||
echo "a,b" >> test.csv | ||
cat test.csv | ||
csvsql -v --db postgresql:///disclosure-backend --insert test.csv | ||
echo "List tables" | ||
psql -c "SELECT * FROM pg_catalog.pg_tables WHERE schemaname != 'pg_catalog' AND schemaname != 'information_schema';" | ||
|
||
pip show sqlalchemy | ||
- name: Create csv files | ||
run: | | ||
make clean | ||
make download | ||
make import | ||
make process | ||
- name: Summarize results | ||
run: | | ||
echo "List tables" | ||
psql -c "SELECT * FROM pg_catalog.pg_tables WHERE schemaname != 'pg_catalog' AND schemaname != 'information_schema';" | ||
|
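The "List all changed files" step above decides whether the dev container image needs rebuilding by matching each changed path against a few glob patterns, and only on push events. As a minimal sketch of that same decision in isolation (assuming the patterns shown in the diff; `needs_devcontainer_rebuild` is a hypothetical helper name, not something this PR defines), the logic could be written as a shell function that takes the event name and the changed paths as arguments:

```bash
#!/usr/bin/env bash
# Sketch only: mirrors the glob patterns from the workflow's filter step.
# needs_devcontainer_rebuild is a hypothetical name used for illustration.

# Usage: needs_devcontainer_rebuild EVENT_NAME [PATH...]
# Prints "true" if the event is a push and any path matches a pattern
# that affects the dev container image; prints "false" otherwise.
needs_devcontainer_rebuild() {
  local event_name="$1"
  shift
  local file

  # As in the workflow, only push events can trigger a rebuild.
  if [ "$event_name" != "push" ]; then
    echo false
    return 0
  fi

  for file in "$@"; do
    case "$file" in
      .devcontainer*|*requirements.txt*|Gemfile*)
        echo true
        return 0
        ;;
    esac
  done

  echo false
}

needs_devcontainer_rebuild push README.md download/requirements.txt   # prints "true"
needs_devcontainer_rebuild push README.md                             # prints "false"
needs_devcontainer_rebuild workflow_dispatch .devcontainer/Dockerfile # prints "false"
```

Taking the paths as quoted arguments means a file name containing spaces is matched as a single path, which the unquoted `for file in ...` loop in the workflow does not guarantee.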
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,15 +4,21 @@ on: | |
jobs: | ||
check: | ||
runs-on: ubuntu-latest | ||
container: | ||
image: ghcr.io/caciviclab/disclosure-backend-static/${{github.ref_name}}:latest | ||
Review comment: What happens here if no image has been pushed for
Reply: It should fail. Good question, however. I've updated main.yml to build the image as soon as a new branch is created. Since the above is a manually run workflow, you would have to be quick to run it if you wanted to generate the failure. |
||
credentials: | ||
username: ${{ github.actor }} | ||
password: ${{ secrets.github_token }} | ||
|
||
env: | ||
REPO_OWNER: ${{ github.repository_owner}} | ||
REPO_BRANCH: ${{ github.ref_name }} | ||
SERVICE_ACCOUNT_KEY_JSON: ${{ secrets.SERVICE_ACCOUNT_KEY_JSON }} | ||
GDRIVE_FOLDER: ${{ vars.GDRIVE_FOLDER }} | ||
steps: | ||
- uses: actions/checkout@v3 | ||
- run: pip install -r gdrive_requirements.txt | ||
- run: python test_pull_from_gdrive.py | ||
- name: Test pull from gdrive | ||
run: python test_pull_from_gdrive.py | ||
- name: Archive pulled files | ||
uses: actions/upload-artifact@v3 | ||
with: | ||
|
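Both workflows reference the image as `ghcr.io/caciviclab/disclosure-backend-static/${{github.ref_name}}:latest`, so the branch name is a path segment and the tag is always `latest`. The review thread raises the alternative of putting the branch in the tag. One thing to keep in mind with that form is that Docker tags allow only letters, digits, `_`, `.` and `-`, so a branch such as `feat/pull-v2-api` (this PR's base) would need its `/` replaced first. A rough sketch of the two forms, where `branch_to_tag` is a hypothetical helper and the replace-with-hyphen rule is an assumption rather than anything the PR specifies:

```bash
#!/usr/bin/env bash
# Sketch only: branch_to_tag is a hypothetical helper, not part of this PR.

# Replace every character that is not allowed in a Docker tag with "-".
# This does not handle the other tag rules (no leading "." or "-",
# at most 128 characters).
branch_to_tag() {
  printf '%s' "$1" | tr -c 'A-Za-z0-9_.-' '-'
}

image_base="ghcr.io/caciviclab/disclosure-backend-static"
branch="feat/pull-v2-api"

# Form used in the PR: branch as a path segment, fixed "latest" tag.
echo "${image_base}/${branch}:latest"
# prints "ghcr.io/caciviclab/disclosure-backend-static/feat/pull-v2-api:latest"

# Alternative form: one image name, branch encoded in the tag.
echo "${image_base}:$(branch_to_tag "$branch")"
# prints "ghcr.io/caciviclab/disclosure-backend-static:feat-pull-v2-api"
```

With the second form, two branches that differ only in a replaced character (for example `feat/x` and `feat-x`) would map to the same tag, which the path-segment form avoids.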
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,4 +18,6 @@ cat <<-QUERY | psql ${database_name} | |
DELETE FROM "$table_name" | ||
WHERE "Tran_Date" is NULL; | ||
QUERY | ||
else | ||
echo | ||
fi |
Review comment: Why do this? I can't see why the docker build would need to pwd
Reply: removed