Draft: CLI, Docker, and More Download Types #56
The continuous integration run for the latest commit (49fe926) is at: https://github.com/ktmeaton/GISAIDR/actions/runs/10688769327
Wow! What an amazing contribution @ktmeaton! I'm at a conference today but will review this asap. You're a legend 🎉
It's a lot of content, my battle with Nextflow got a little out of hand and I kept adding more 😪 but I hope there's a couple pieces here that you might also find useful!
Hi @Wytamma,
I'm adapting your package for a Nextflow pipeline, with a priority focus on mpox. I implemented several new features that I wanted to propose to you! Broadly speaking, this pull request adds:
Continuous Integration for the latest commit 49fe926: https://github.com/ktmeaton/GISAIDR/actions/runs/10688769327
Issues Resolved
I think this pull request will resolve, or at least address, the following issues:
- `GISAIDR` to integrate nicely with Snakemake. Providing a Docker image is an option, although it doesn't help the user if they absolutely need to support `conda` as a runtime option.

Changes
- A command-line interface: `bin/GISAIDR`. See CLI Usage section.
  - `bin/GISAIDR.bat` or `Rscript bin/GISAIDR`.
  - `v0.9.10`.
- New R dependency `optparse`.
- New function `download_files` in `R/download.R`.
  - `download` function to work on any of GISAID's provided data files, including: Augur Input, Dates and Location, Patient Status, Sequencing Technology, and Sequences.
  - `download` function mostly intact, for backwards compatibility (just added a slight logging change).
- New Download Types.
- New functions `log.info`, `log.error`, `log.warn` in `R/core.R`
  - `log.debug` function.
  - `log.info` also allows for different verbosity levels.
  - `--verbosity 2` is helpful for users who want extra logging output, without invoking the full `--debug` logs of all the HTTP requests.
- New parameter `subtype` for `R/query.R`
  - `subtype` ("A", "B") for EpiRSV.
- Docker image via `Dockerfile`
  - `GISAIDR` R package and CLI.
  - `ktmeaton/gisaidr:cli`
- Tests
  - `complete` and `high_quality` work a bit differently
  - `download_files` function.
- Continuous Integration
  - `ubuntu-latest` as an operating system for the Build workflow.
  - `build` job to `test` to reflect that the steps primarily test for errors.
  - `docker`. Builds the R package and CLI into a Docker image. If the branch is `master` or starts with a `v` (ex. `v0.10.0`), the CI job will also push the image to container registries. By default, this will just be the GitHub package registry (ex. https://github.com/ktmeaton/GISAIDR/pkgs/container/gisaidr). But if the repository has defined the secrets `DOCKER_USERNAME` and `DOCKER_PASSWORD`, it will also push to DockerHub (ex. https://hub.docker.com/r/ktmeaton/gisaidr/tags). The `master` branch will update the `latest` image tag.

CLI Usage
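As a rough picture of the `--verbosity` levels mentioned under Changes, the sketch below gates messages on a numeric level. It is written in shell rather than R, and the names `VERBOSITY` and `log_info`, the default level, and the message texts are all hypothetical. It is not the package's `log.info` implementation, only an illustration of how a level of 2 could surface extra messages without turning on the full `--debug` HTTP logs.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print a message only when the requested verbosity
# is at least the level the message was tagged with.
VERBOSITY="${VERBOSITY:-1}"   # assumed default level, not taken from the PR

log_info() {
  local level="$1"
  shift
  if [ "$VERBOSITY" -ge "$level" ]; then
    echo "INFO: $*"
  fi
}

VERBOSITY=2
log_info 1 "basic progress message"   # shown at level 1 and above
log_info 2 "extra detail message"     # shown at level 2 and above
log_info 3 "very detailed message"    # hidden at level 2
```

Keeping the level check inside one helper means callers only tag each message with a level; nothing else changes when a user raises or lowers the verbosity.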
Setup
Create a `credentials.yml` file.
Save some test accessions.
Local
Install the package to add the dependency `optparse`.

    Rscript -e "devtools::install('.')"

Preview usage.
Download EpiCoV test data based on accessions.
`EpiCoV.sequences.fasta` and `EpiCoV.metadata.tsv`, which is a join of all the different tables (ex. Dates and Location, Patient Status, etc.).

Docker
Preview usage.

    docker run -v $(pwd):/tmp ktmeaton/gisaidr:cli GISAIDR --help

Download EpiPox test data based on a query.

    docker run -v $(pwd):/tmp ktmeaton/gisaidr:cli GISAIDR \
      --credentials credentials.yml \
      --database EpiPox \
      --prefix EpiPox \
      --dates-and-location \
      --location Canada \
      --from-subm 2024-04-01 --to-subm 2024-06-01 \
      --max-records 3

- `/tmp` directory.
- `/tmp` so that `GISAIDR` can access our `credentials.yml` and write output metadata and sequences to it.
- `EpiPox.metadata.tsv`. No sequences will be downloaded because `--sequences` was not requested.

Singularity
Pull image.

- `warn rootless` messages on pull. Comes from the `micromamba` base image.

Preview usage.
Download test data from `EpiRSV`.

- `/tmp` directory to match Docker.
- `/tmp` so that `GISAIDR` can access our `credentials.yml` and write output metadata and sequences to it.

Nextflow
If you want to use the CLI in `Nextflow` with a `singularity` runtime, you need to do special handling for the executable command, because the `micromamba` base image integrates with Singularity in an odd way.

Temporary Changes
I have made the following temporary changes, which should be reverted before merging:

- `master`.
- `cli` branch. Should be reverted to just `master` and version tagged releases `v*`.
- `R/internal_query.R` to make it easier to read. Might be a linting violation now though; I can restore the original formatting.
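
The intended publishing rule once the temporary `cli` trigger is reverted (push images only for `master` or refs starting with `v`, with `master` updating `latest`) can be sketched as a small shell function. This is an illustration under assumptions, not the workflow's actual code: the function name `image_tag_for_ref` is hypothetical, and reusing a `v*` ref name as the image tag is a guess, since the description only says such refs are pushed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the image tag to push for a given ref name,
# or print nothing and return 1 when the ref should only be built.
image_tag_for_ref() {
  local ref="$1"
  case "$ref" in
    master) echo "latest" ;;   # stated in the PR: master updates `latest`
    v*)     echo "$ref" ;;     # assumption: version refs reuse their own name
    *)      return 1 ;;        # any other branch: build only, no push
  esac
}

for ref in master v0.10.0 cli; do
  if tag="$(image_tag_for_ref "$ref")"; then
    echo "$ref -> push ktmeaton/gisaidr:$tag"
  else
    echo "$ref -> build only"
  fi
done
```

Note that a plain "starts with `v`" match, as described, would also accept any branch whose name happens to begin with `v`; a stricter pattern may be preferable if such branches exist.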