Bridge to connect Google Spreadsheets with the mx-elections-2021 API.

Read data, process it, and populate an empty database:

    python pipeline.py <local|fb>

Compare the data from the GSheet with the previously saved data (`dataset/person_old.csv`) and push changes, additions, and deletions to the API:

    python updater.py
- Install dependencies with:

    $ pip install -r requirements.txt

- Requires Python >= 3.6.
This pipeline depends heavily on Google Spreadsheets, where information previously retrieved from multiple sources by a team is stored. There are two sheets:
1. The capture sheet, with information about candidates. We consider the following fields:
   person_id | role_type | first_name | last_name
   full_name | nickname | abbreviation | coalition
   state | area | membership_type | start_date
   end_date | is_substitute | is_titular | date_birth
   gender | dead_or_alive | last_degree_of_studies | profession_[1-6]
   Website | URL_FB_page | URL_FB_profile | URL_IG
   URL_TW | URL_others | URL_photo | source_of_truth
2. Static tables and catalogs:
   - Area
   - Chamber
   - Role
   - Coalition
   - Party
   - Contest
   - Profession
   - Url types
With this information we can populate an empty database using the `pipeline.py` module, or update existing data using the `updater.py` module. Additionally, the `check_predictions.py` module compares the Facebook URL predictions with the capture sheet and saves the differences into `predictions/`.
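As a rough illustration of that comparison, assuming pandas (the file paths and column names here are assumptions, not the module's actual inputs):

```python
import pandas as pd

# Hypothetical inputs: the captured data and the Facebook URL predictions.
capture = pd.read_csv("dataset/person_current.csv")
predicted = pd.read_csv("predictions/fb_predictions.csv")

# Align both frames on person_id and keep the rows where the captured
# Facebook URL disagrees with the predicted one.
merged = capture.merge(predicted, on="person_id", suffixes=("_capture", "_pred"))
diff = merged[merged["URL_FB_page_capture"] != merged["URL_FB_page_pred"]]

# Persist the disagreements, mirroring the module's output directory.
diff.to_csv("predictions/differences.csv", index=False)
```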
First we need a way to connect Python with the spreadsheets. The modules below handle spreadsheet manipulation and authentication, respectively.

NOTE: You need a Google developer account to get an API token (the `credential.json` file) for the `auth.py` script.
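For reference, a minimal sketch of such a connection, assuming the `gspread` library and a service-account `credential.json` (this repo's modules may use a different client, and the sheet name is a placeholder):

```python
import gspread

# Authenticate with the service-account key file.
gc = gspread.service_account(filename="credential.json")

# Open the capture spreadsheet by name (placeholder) and take the first worksheet.
worksheet = gc.open("capture-sheet").sheet1

# Each row becomes a dict keyed by the header row (person_id, role_type, ...).
records = worksheet.get_all_records()
print(f"{len(records)} rows read")
```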
This module reads the static tables and stores them in memory. These tables are sent to the API and are also required to build the dynamic tables.
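As a sketch of the idea (the function, column names, and worksheet handling are assumptions): read each catalog once and keep it as an in-memory lookup that the dynamic tables can resolve against:

```python
# Hypothetical in-memory catalogs built from the static sheets.
# Maps a catalog name to a {name: id} lookup used when building dynamic tables.
catalogs = {}

def load_catalog(worksheet, name):
    """Read a gspread worksheet (see the sketch above) into a lookup table."""
    rows = worksheet.get_all_records()
    catalogs[name] = {row["name"]: row["id"] for row in rows}
```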
This module reads the capture and static sheets and performs three operations:

- Runs a set of validations (`validations.py`); the errors found are saved into `errors/` (see the sketch below). The validated fields are:
  - last_name
  - membership_type
  - dates
  - urls
  - professions
- Builds the dynamic tables, using the static tables and catalogs.
- Uploads the static and dynamic data to the API.

Additionally, the module saves a `csv` copy of the database into `csv_db/`.
NOTE: This module requires an empty database on the API side, since it reads all the information from the capture sheet and sends it.
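For illustration, here is a hypothetical version of one such check (the function name, pattern, and field list are assumptions; the actual rules live in `validations.py`):

```python
import re

# Loose URL pattern for illustration only; the real rules may be stricter.
URL_PATTERN = re.compile(r"^https?://[\w.-]+(/\S*)?$")

def validate_urls(row, url_fields=("Website", "URL_FB_page", "URL_TW")):
    """Return the (field, value) pairs of `row` that fail the URL pattern."""
    errors = []
    for field in url_fields:
        value = row.get(field, "")
        if value and not URL_PATTERN.match(value):
            errors.append((field, value))
    return errors
```

Errors collected this way could then be written under `errors/`, as the module does.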
This module compares the current capture data (`dataset/person_current.csv`) with the previous data (`dataset/person_old.csv`). The differences are sent to the API and logged into `logs/` as three types:
- Changes
- Additions
- Deletions
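Conceptually, the diff can be derived from the two snapshots keyed by `person_id`. A simplified sketch with pandas, assuming both files share the same columns (the real module may differ):

```python
import pandas as pd

current = pd.read_csv("dataset/person_current.csv").set_index("person_id")
old = pd.read_csv("dataset/person_old.csv").set_index("person_id")

# Additions and deletions: ids present in only one snapshot.
additions = current.loc[current.index.difference(old.index)]
deletions = old.loc[old.index.difference(current.index)]

# Changes: shared ids whose rows differ in any column.
# Caveat: NaN != NaN is True, so empty cells may need extra handling.
shared = current.index.intersection(old.index)
changed = (current.loc[shared] != old.loc[shared]).any(axis=1)
changes = current.loc[shared][changed]
```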
NOTE: Due to the time constraints of the project, this module is a work in progress (WIP). Pending tasks:

- Automatically replace `person_old.csv` with the `person_current.csv` data once the update completes.
- Test the script.
An example of what the capture sheets look like can be found in `data_samples/`.