sbx_sga (Single Genome Assembly) is a sunbeam extension for isolate QC, assembly, and classification. This pipeline uses Mash for quality control, Shovill for bacterial isolate assembly, CheckM2 and QUAST for assembly QC, MLST for typing, Bakta for annotation, abriTAMR for AMR profiling, and Sylph for taxonomic classification.
- mash_ref: the reference file for running Mash (should be a file ending in `.msh`)
- checkm_ref: the diamond database for running CheckM2 (should be a file ending in `.dmnd`)
- bakta_ref: the bakta reference database (should be a directory similar to `.../bakta_db/db/`)
- genomad_ref: the genomad reference database (should be a directory containing many files, including `version.txt`, which is the file our pipeline checks for to verify that the database exists)
- sylph_ref: the sylph reference database (should be a `.syldb` file)
- snippy_ref: the snippy reference genome (should be a FASTA file containing a genome of at least decent quality)
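
The expected forms listed above can be checked before launching a run. The following is a minimal sketch, not part of sbx_sga itself: `check_ref` is a hypothetical helper name, and the checks only mirror the descriptions in the list (file extension, directory, presence of `version.txt`). It does not validate database contents or genome quality.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not provided by sbx_sga): check that a reference
# path has the form described in the list above.
# Usage: check_ref <config_key> <path>
# Returns 0 if the path looks right, 1 if not, 2 for an unknown key.
check_ref() {
    local key="$1" path="$2"
    case "$key" in
        mash_ref)    [[ -f "$path" && "$path" == *.msh ]] ;;    # Mash sketch file
        checkm_ref)  [[ -f "$path" && "$path" == *.dmnd ]] ;;   # diamond database
        sylph_ref)   [[ -f "$path" && "$path" == *.syldb ]] ;;  # sylph database
        bakta_ref)   [[ -d "$path" ]] ;;                        # database directory
        genomad_ref) [[ -f "$path/version.txt" ]] ;;            # file the pipeline looks for
        snippy_ref)  [[ -s "$path" ]] ;;                        # non-empty file; quality is not assessed
        *) echo "unknown key: $key" >&2; return 2 ;;
    esac
}
```

For example, `check_ref genomad_ref /path/to/db || echo "genomad_ref looks wrong"` prints a warning when `version.txt` is missing from the given directory (the path here is a placeholder).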
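
To build the databases, first create and activate a conda environment containing the required tools:

    conda create -n sga_dbs -c conda-forge -c bioconda mash bakta checkm2 genomad diamond prodigal
    conda activate sga_dbs

For making smaller test databases, see .tests/data/README.md.

To download the genomad database:

    genomad download-database /path/to/db_storage/
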
More docs.
