Revisit fetch from GISAID #242
Labels: enhancement (New feature or request)
Comments
ivan-aksamentov added a commit that referenced this issue on Dec 10, 2021:
I don't know if it's any faster, but why not. The results are correct in my local testing. Locally, it does use multiple threads, but not too many. We might be bound by download speed rather than decompression though. Related: #242
I did not know it existed. Do you know the URL? Does it have the same data in it? In the meantime we could also try parallel bzip: #247
Ah, it does not exist, as far as we know. This would mean asking GISAID to switch the export we currently get to xz.
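To get a rough feel for the bzip2 versus xz trade-off discussed in this thread, the comparison can be sketched with Python's standard `bz2` and `lzma` modules. This is only an illustration under assumptions: the function name `compare_codecs` and the sample payload are hypothetical, and numbers from synthetic data say little about the real GISAID export.

```python
import bz2
import lzma
import time


def compare_codecs(data: bytes) -> dict:
    """Compress `data` with bzip2 and xz, then time decompression of each.

    Returns a dict keyed by codec name with the compressed size in bytes
    and the wall-clock decompression time in seconds.
    """
    results = {}
    # lzma.compress() writes the .xz container format by default.
    for name, codec in (("bz2", bz2), ("xz", lzma)):
        compressed = codec.compress(data)
        start = time.perf_counter()
        restored = codec.decompress(compressed)
        elapsed = time.perf_counter() - start
        if restored != data:
            raise ValueError(f"{name} round trip did not restore the input")
        results[name] = {
            "compressed_bytes": len(compressed),
            "decompress_seconds": elapsed,
        }
    return results


if __name__ == "__main__":
    # Hypothetical sample payload; substitute a slice of a real export.
    sample = b"ACGTACGTTTGACCA\n" * 200_000
    for codec_name, stats in compare_codecs(sample).items():
        print(codec_name, stats)
```

Running it on a representative slice of the actual export would give a more meaningful answer than the repetitive sample used here.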
Context
On Dec 2, 2021, multiple fetch-and-ingest runs for GISAID failed. The failure pattern was that the download would run for a while and then the transfer would be closed before it completed. Subsequent attempts to fetch would hit a 503 error. We manually triggered fetch-and-ingest two more times and saw the same failure pattern.

Possible solution
The scheduled run today had no issues, so this may have just been unfortunate timing of our runs being interrupted by GISAID's reboots. We can revisit the following solutions in anticipation of similar future issues:
- Update fetch-from-gisaid to stop decompressing during streaming, to shorten the time the connection stays open. However, decompressing in a separate step would increase the total time to run fetch-and-ingest.
- Ask GISAID to switch to xz, which has a better compression ratio and decompression time than bzip2. Regardless of errors, this would be a huge improvement for us and would dramatically decrease fetch-and-ingest runtime.
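The first option, downloading the compressed file in full and only then decompressing it, could look roughly like the sketch below. It uses only Python's standard library and is not the project's actual fetch-from-gisaid script; the function names, file names, and chunk size are assumptions made for illustration.

```python
import bz2
import shutil
import urllib.request
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary illustrative value


def download_to_file(url: str, dest: Path) -> Path:
    """Stream the raw, still-compressed response body straight to disk.

    No decompression happens here, so the connection is held open only
    for as long as the transfer itself takes.
    """
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out, CHUNK_SIZE)
    return dest


def decompress_bz2(src: Path, dest: Path) -> Path:
    """Decompress a bzip2 file on disk after the connection is closed."""
    with bz2.open(src, "rb") as compressed, open(dest, "wb") as out:
        shutil.copyfileobj(compressed, out, CHUNK_SIZE)
    return dest
```

A call would chain the two steps, for example `decompress_bz2(download_to_file(url, Path("export.bz2")), Path("export.json"))`, with hypothetical file names. The trade-off is the one noted above: the connection closes sooner, but the steps no longer overlap, so total runtime goes up.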