Profile ncov-ingest #240
Comments
Another aspect of profiling is the removal / storage of (large) files. This is applicable both for storage space while running, as well as the behavior of As of 45dcea6, currently we use snakemake's |
PR #231 has been merged and I've just started a run with this GitHub action.
Using |
Adding as part of #240 to help collect more data for tackling #446. One unexpected behavior that I ran into when testing the `--stats` option is that Snakemake doesn't generate the stats file if the workflow exits with an error at any step. Note that the Snakemake `--stats` option is not available starting with Snakemake v8, so this will need to be removed when we eventually upgrade Snakemake in our runtimes.
Currently, GISAID ingest takes ~4 hours. We should profile the pipeline to figure out if improvements can be made without a major overhaul (i.e. incremental ingest with caches or a database).
We could use Snakemake's `benchmark`, `--stats`, and/or `--report` to get an overview of the pipeline. We should upload the outputs to S3 to have a record of changes over time.
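As a rough sketch of how the `benchmark` outputs could be reviewed before uploading them anywhere: Snakemake's `benchmark` directive writes one tab-separated file per rule, with columns that include `s` (wall-clock seconds) and `max_rss` (peak memory in MB). The snippet below ranks rules by runtime. The `benchmarks/` directory, the `.txt` naming, and the `summarize_benchmarks` helper are all hypothetical and not taken from ncov-ingest.

```python
import csv
from pathlib import Path


def _to_float(value):
    """Parse a benchmark field; return None for placeholders such as '-'."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def summarize_benchmarks(benchmark_dir):
    """Return (name, seconds, max_rss_mb) tuples, slowest first.

    Assumes one Snakemake benchmark TSV per rule in `benchmark_dir`
    (hypothetical layout), each named `<rule>.txt`.
    """
    rows = []
    for path in sorted(Path(benchmark_dir).glob("*.txt")):
        with path.open(newline="") as handle:
            # A file holds one row per repeat; keep every row.
            for record in csv.DictReader(handle, delimiter="\t"):
                seconds = _to_float(record.get("s"))
                if seconds is None:
                    continue  # skip rows without a usable runtime
                rows.append((path.stem, seconds, _to_float(record.get("max_rss"))))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Calling `summarize_benchmarks("benchmarks")` after a run would list the slowest rules first, which is probably the quickest way to see where the ~4 hours go.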