v0.6.0
v0.6.0 modifies the `gbsketch` strategy to reduce the number of calls to the NCBI REST API (necessary due to policy changes, but a better strategy regardless). Rather than downloading all files via the API, we download a single file via the API that contains direct fetch links to all requested accessions. We then download accessions in parallel, currently limited to 30 simultaneous downloads. This release also enables streaming processing for both `gbsketch` and `urlsketch`, reducing memory requirements. Finally, it improves batch restart for `--keep-fasta` by not re-downloading or overwriting completed downloads.
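The download pattern described above (a fixed cap on simultaneous fetches, plus skipping files that already exist on restart) can be sketched roughly as follows. This is an illustration only, not the plugin's actual code: `fetch_all`, the `fetch` callback, the `.fna.gz` naming scheme, and the `.part` temporary suffix are all hypothetical names chosen for the example. Only the limit of 30 comes from these notes.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable, Dict, List, Optional

MAX_CONCURRENT = 30  # the limit stated in these release notes


async def fetch_all(
    links: Dict[str, str],
    fetch: Callable[[str], Awaitable[bytes]],  # hypothetical downloader callback
    outdir: Path,
    limit: int = MAX_CONCURRENT,
) -> List[str]:
    """Fetch every link that is not already on disk; return the accessions fetched."""
    sem = asyncio.Semaphore(limit)
    outdir.mkdir(parents=True, exist_ok=True)

    async def one(acc: str, url: str) -> Optional[str]:
        dest = outdir / f"{acc}.fna.gz"  # hypothetical naming scheme
        if dest.exists():
            return None  # finished in an earlier run: do not re-download or overwrite
        async with sem:  # at most `limit` fetches are in flight at once
            data = await fetch(url)
        tmp = dest.with_name(dest.name + ".part")
        tmp.write_bytes(data)
        tmp.replace(dest)  # rename only after the whole payload is written
        return acc

    results = await asyncio.gather(*(one(a, u) for a, u in links.items()))
    return [acc for acc in results if acc is not None]
```

Writing to a temporary name and renaming afterwards is a design choice of this sketch: it means a file at the final path can be treated as complete when a batch is restarted.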
Note: `gbsketch` now relies on gzip processing (internal crc32 checks) rather than md5sum checks to ensure we have complete downloads. `urlsketch` will still check md5sums if they are provided.
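To illustrate why gzip processing can stand in for an md5sum check: a gzip member ends with a trailer holding the CRC-32 and length of the uncompressed data, so fully decompressing a payload fails if it was truncated or corrupted. The sketch below shows the idea with Python's standard library; `is_complete_gzip` is a hypothetical helper name, and the plugin's own implementation may differ.

```python
import gzip
import zlib


def is_complete_gzip(payload: bytes) -> bool:
    """Return True if `payload` decompresses fully and passes gzip's CRC-32 check."""
    try:
        gzip.decompress(payload)  # reads to the end and verifies the trailer
    except (EOFError, OSError, zlib.error):
        # EOFError: stream ended early; OSError covers gzip.BadGzipFile (bad CRC or header)
        return False
    return True
```

Unlike an md5sum comparison, this needs no externally supplied checksum, but it only detects damage to the compressed stream itself, not a file that differs from what the server intended to send.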
What's Changed
- MRG: upd readme: installation, authors by @bluegenes in #200
- Update README.md by @bluegenes in #201
- add more api key docs by @bluegenes in #206
- MRG: fix `gbsketch` NCBI downloads by using dehydrate-rehydrate approach by @bluegenes in #222
- update some crates by @bluegenes in #229
- Bump openssl from 0.10.69 to 0.10.71 by @dependabot in #205
- MRG: `gbsketch` streaming processing by @bluegenes in #230
- upd zip to 2.6 by @bluegenes in #231
- MRG: streaming `urlsketch` processing by @bluegenes in #232
- MRG: Consolidate reused code across `gbsketch`/`urlsketch` by @bluegenes in #233
- MRG: optionally avoid re-downloading existing FASTA when using `--keep-fasta` by @bluegenes in #235
- MRG: upd version to 0.6.0; pin rust to 1.74 by @bluegenes in #234
Full Changelog: v0.5.0...v0.6.0