Hospital Data Scraper Does Not Log Errors #1024
Thanks for flagging this @Mr0grog - I'll take a look at how we're logging in the other code and see how that can be integrated into the hospital script.
The hospital script currently has this:

    except Exception as error:
        message = click.style(
            'Hospitalization data fetch encountered error', fg='red'
        )
        click.echo(f'{message}: {error}', err=True)
        traceback.print_exc()
        sys.exit(1)

Does something need to change in the GH action YAML? I see this in the Action for `scraper_data.py`:

    python scraper_data.py > today.json 2> errors.txt || true
    # Merge new data into previous data. (Note this has to be two steps;
    # if we read $OUT_PATH as input for jq and write stdout to it at the
    # same time, we get conflicts and bad output.)
    jq -s '.[0] + .[1]' $OUT_PATH today.json > merged.json
    mv merged.json $OUT_PATH
    ERRORS=`cat errors.txt`
    if [ -n "${ERRORS}" ]; then
      echo "Encountered the following errors while scraping:"
      echo "------------------------------------------------"
      echo "${ERRORS}"
      # The error text can contain unescaped quotes, newlines, etc.
      # Use jq to make sure we are composing correctly formatted JSON.
      # `--raw-input` treats the input as strings instead of JSON.
      # `--slurp` causes all lines to be combined into one string.
      ERRORS_JSON=`cat errors.txt | jq --slurp --raw-input '{"text": .}'`
      curl -X POST -H 'Content-type: application/json' --data "${ERRORS_JSON}" $SLACK_WEBHOOK_URL
      # Raise an error so this step fails.
      false
    fi

I'm pretty unfamiliar with GH Actions but happy to try my hand at this. It looks like a bash script inside the YAML, and I can see how it integrates with Slack in the section above. Seems like the next step would be to put in a PR to adjust
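For anyone less familiar with jq, the `jq --slurp --raw-input '{"text": .}'` line in the script above can be mirrored in Python. This is only a sketch of the same idea, not code from the repo: `build_slack_payload` and the sample traceback text are made up for illustration, and the only thing taken from the workflow is the `{"text": ...}` shape of the body.

```python
import json


def build_slack_payload(error_text: str) -> str:
    """Return a JSON body shaped like the one the jq command builds.

    json.dumps escapes quotes, backslashes, and newlines, so raw
    traceback text can be embedded safely in the request body.
    (Hypothetical helper, not part of the repo.)
    """
    return json.dumps({"text": error_text})


# Made-up stderr content containing quotes and newlines.
sample = (
    'Traceback (most recent call last):\n'
    '  File "scraper.py", line 1\n'
    'ValueError: "bad" data\n'
)
payload = build_slack_payload(sample)
print(payload)
```

The point is the same as in the shell comments: the error text is treated as one opaque string and escaped by a JSON encoder rather than being pasted into a JSON template by hand.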
Yep, you've got it. I should have been clearer in the issue description. The other key bit is the part on the step that runs the scraper, which lets the job continue even if that step errored out. The best way to test is to create a free Slack org and webhook URL for yourself, alter the scraper to make it fail, and try that out. You can run it on a fork or test locally with act. All of that is really high overhead, though, and I'm OK if we just review with eyeballs for this, even if that may not be ideal.
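A lower-overhead sanity check, short of running the workflow, might look like the sketch below. It does not touch Slack or GitHub Actions. It only confirms that a script which fails the way the quoted `except` block does (message on stderr, exit code 1) produces stderr output that the `[ -n "${ERRORS}" ]` check would pick up. `FAILING_SCRAPER` and `run_and_collect_errors` are hypothetical stand-ins, not names from the repo.

```python
import subprocess
import sys
from typing import Tuple

# Stand-in for a failing scraper (assumption: the real script behaves like
# the quoted except block, i.e. writes to stderr and exits with status 1).
FAILING_SCRAPER = (
    "import sys\n"
    "print('Hospitalization data fetch encountered error: boom', file=sys.stderr)\n"
    "sys.exit(1)\n"
)


def run_and_collect_errors(code: str) -> Tuple[int, str, str]:
    """Run `code` in a child Python process; return (exit code, stdout, stderr)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr


exit_code, out, errors = run_and_collect_errors(FAILING_SCRAPER)
# Roughly mirrors the shell check: command substitution drops trailing
# newlines, then `-n` tests whether anything is left.
should_notify = bool(errors.rstrip("\n"))
print(exit_code, should_notify)
```

This only exercises the stderr side of things. Whether the job keeps going after a failed step and whether the webhook call works still need a fork, act, or a careful read of the YAML.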
It turns out that the hospital scraper has been failing for a while (e.g. https://github.com/sfbrigade/stop-covid19-sfbayarea/runs/2107127471?check_suite_focus=true), but we didn't know because it's not logging failures to Slack like the news and data scrapers are. This needs fixing.