Pre-processes the provided CSV files for use in production databases.
- Extrapolate inferred data from CSV files
- Reformat SQL data to be NoSQL friendly
- Import NoSQL data into Firestore
- Use the resulting data in production
- Python 3.6+
- Grade data:
- in CSV format with the schema seen in sample/sample.csv
- Real example data available at cougargrades/FOIA-IR05921
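Before running the tools, it can help to confirm that a grade file declares the same columns as sample/sample.csv. The following is a minimal sketch of such a check, not part of this repository; the helper names `read_header` and `header_matches` are hypothetical, and the check only compares the header row rather than validating the data.

```python
import csv


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    # utf-8-sig tolerates an optional byte-order mark at the start of the file.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        return next(csv.reader(handle), [])


def header_matches(sample_path, candidate_path):
    """True when both files declare the same columns in the same order."""
    return read_header(sample_path) == read_header(candidate_path)
```

For example, `header_matches("sample/sample.csv", "foia/some_file.csv")` (the second path is a placeholder) would report whether the two headers agree.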
- Install pipenv
pipenv install
pipenv shell
$ ./csv2db.py foia/*.csv --out records.db
$ ./db2jsonl.py records.db --out db/
$ ./jsonl2firestore.py --key firebaseadminsdk.json --meta db/catalog_meta/meta.json db/catalog/*.jsonl
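The three commands above rely on the shell to expand `foia/*.csv` and `db/catalog/*.jsonl`. If the pipeline is driven from Python instead, the globs have to be expanded by hand, and the last one only after `db2jsonl.py` has written its output. A rough sketch under those assumptions follows; `step_commands` and `run_pipeline` are hypothetical helpers, and invoking the scripts through `sys.executable` is a choice made here rather than something the repository prescribes.

```python
import glob
import subprocess
import sys


def step_commands(csv_pattern="foia/*.csv", db_file="records.db",
                  out_dir="db", key_file="firebaseadminsdk.json"):
    """Yield each command lazily so every glob sees files made by earlier steps."""
    yield [sys.executable, "csv2db.py",
           *sorted(glob.glob(csv_pattern)), "--out", db_file]
    yield [sys.executable, "db2jsonl.py", db_file, "--out", f"{out_dir}/"]
    # This glob is evaluated only when the generator resumes, i.e. after step 2.
    yield [sys.executable, "jsonl2firestore.py",
           "--key", key_file,
           "--meta", f"{out_dir}/catalog_meta/meta.json",
           *sorted(glob.glob(f"{out_dir}/catalog/*.jsonl"))]


def run_pipeline(**options):
    """Run the three steps in order, stopping at the first failure."""
    for command in step_commands(**options):
        subprocess.run(command, check=True)
```

Because `step_commands` is a generator, `run_pipeline` finishes each step before the next command is built.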
usage: csv2db.py [-h] [--out OUTFILE] grades.csv [grades.csv ...]
Pre-process CSV grade data into an intermediary database format.
positional arguments:
  grades.csv     A set of CSV files to source data from
optional arguments:
  -h, --help     show this help message and exit
  --out OUTFILE  SQLite db file to create
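To illustrate the general idea of this step, here is a minimal sketch of copying CSV rows into a SQLite file with the standard library. It is not the implementation in csv2db.py: the function name `load_csv_into_sqlite`, the table name `records`, and the choice to store every column as TEXT are assumptions made for the example, and the inference work the real script performs is left out.

```python
import csv
import sqlite3


def load_csv_into_sqlite(csv_paths, db_path, table="records"):
    """Copy every row of the given CSV files into one SQLite table.

    Columns come from the header of the first file and are stored as TEXT.
    """
    connection = sqlite3.connect(db_path)
    try:
        columns = None
        for path in csv_paths:
            with open(path, newline="", encoding="utf-8-sig") as handle:
                reader = csv.DictReader(handle)
                if columns is None:
                    columns = list(reader.fieldnames or [])
                    if not columns:
                        raise ValueError(f"{path} has no header row")
                    column_sql = ", ".join(f'"{name}" TEXT' for name in columns)
                    connection.execute(
                        f'CREATE TABLE IF NOT EXISTS "{table}" ({column_sql})')
                placeholders = ", ".join("?" for _ in columns)
                # Missing columns in later files become NULL via row.get().
                rows = ([row.get(name) for name in columns] for row in reader)
                connection.executemany(
                    f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
        connection.commit()
    finally:
        connection.close()
```

Column names are interpolated into the SQL text, so this sketch is only suitable for trusted input files.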
usage: db2jsonl.py [-h] [--out FOLDER] records.db
Prepare a SQLite database into Firestore-ready JSONL files
positional arguments:
  records.db    Path to the SQLite database generated by csv2db.py
optional arguments:
  -h, --help    show this help message and exit
  --out FOLDER  Folder to store .jsonl files in
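The conversion from SQL rows to JSONL can be pictured roughly as follows. This is a sketch under stated assumptions rather than the logic of db2jsonl.py: it assumes a single table named `records` with a `course` column (both hypothetical) and writes one file per course, which matches file names like `COSC 1430.jsonl` but says nothing about how the real script shapes each document.

```python
import json
import sqlite3
from pathlib import Path


def export_jsonl(db_path, out_dir, table="records", group_by="course"):
    """Write one .jsonl file per distinct group_by value; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    connection = sqlite3.connect(db_path)
    connection.row_factory = sqlite3.Row  # rows become name-addressable
    written = set()
    try:
        for row in connection.execute(f'SELECT * FROM "{table}"'):
            record = dict(row)
            target = out / f"{record[group_by]}.jsonl"
            # Truncate on first write of this run, append afterwards.
            mode = "a" if target in written else "w"
            with open(target, mode, encoding="utf-8") as handle:
                handle.write(json.dumps(record) + "\n")
            written.add(target)
    finally:
        connection.close()
    return sorted(written)
```

Reopening the file for every row keeps the sketch short; a real exporter would more likely keep handles open or group rows in SQL first.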
usage: jsonl2firestore.py [-h] [--key KEY] [--meta META]
                          COSC 1430.jsonl [COSC 1430.jsonl ...]
Import formatted JSONL files into Google Firestore
positional arguments:
  COSC 1430.jsonl  A set of catalog JSONL files to source data from
optional arguments:
  -h, --help       show this help message and exit
  --key KEY        Path to Firebase Service account private key (see: README)
  --meta META      Path to catalog_meta/meta.json
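For orientation, the import step could look something like the sketch below, which uses the `firebase-admin` package. The document layout is an assumption made purely for illustration: each JSONL file becomes one document in a `catalog` collection, named after the file, with the parsed lines stored under a hypothetical `records` field. How jsonl2firestore.py actually maps files to documents, and how it uses `--meta`, is not shown here.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Parse a JSONL file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def upload_documents(key_path, jsonl_paths, collection="catalog"):
    """Upload each JSONL file as one Firestore document (assumed layout)."""
    # Imported lazily so read_jsonl works without firebase-admin installed.
    import firebase_admin
    from firebase_admin import credentials, firestore

    firebase_admin.initialize_app(credentials.Certificate(key_path))
    db = firestore.client()
    for path in jsonl_paths:
        document_id = Path(path).stem  # e.g. "COSC 1430"
        # "records" is a hypothetical field name chosen for this sketch.
        db.collection(collection).document(document_id).set(
            {"records": read_jsonl(path)})
```

`upload_documents` needs network access and a valid service account key, so only `read_jsonl` can be exercised offline.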