Read-level alignment to gene clusters of interest, e.g. the bai operon or butyrate-producing genes. Please refer to sunbeam_database for details. Build a DIAMOND database from a FASTA file of your proteins of interest, and provide a text annotation file with the following columns: geneID, proteinID, ARO, taxon, weight.
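As a rough sketch of the preparation step, the snippet below writes a tiny annotation file with the five columns listed above and checks that every row has exactly five fields. The file names, the example row, and the tab delimiter are assumptions made for illustration; check sunbeam_database for the layout actually expected. The DIAMOND command is left as a comment because it needs DIAMOND installed and a real FASTA file.

```shell
# Hypothetical file name; the tab delimiter is an assumption, not a documented requirement.
annot=annotation.tsv

# Header plus one made-up example row (placeholder values only).
printf 'geneID\tproteinID\tARO\ttaxon\tweight\n' > "$annot"
printf 'baiA\tP_0001\tNA\tClostridium scindens\t1\n' >> "$annot"

# Collect the line numbers of rows that do not have exactly five fields.
bad_rows=$(awk -F'\t' 'NF != 5 { print NR }' "$annot")

if [ -z "$bad_rows" ]; then
    echo "annotation file OK"
else
    echo "malformed rows: $bad_rows"
fi

# Building the DIAMOND database (not run here; proteins.fasta is a placeholder):
# diamond makedb --in proteins.fasta --db proteins
```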
- threads: the number of threads to use for parallel processes
- genes_fp: the path to the downloaded database (directory containing .fasta files)
- evalue:
- alnLen:
- mismatch:
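The last three options are not described here. One plausible reading, offered only as an assumption, is that they are thresholds applied to alignment hits: a maximum e-value, a minimum alignment length, and a maximum number of mismatches. The sketch below applies such a filter with awk to two made-up hits in DIAMOND's default 12-column tabular layout (column 4 is the alignment length, column 5 the mismatch count, column 11 the e-value). The threshold values and the file name are hypothetical.

```shell
# Hypothetical thresholds; the direction of each comparison is an assumption.
evalue=1e-5
alnLen=50
mismatch=5
hits=hits.m8

# Two made-up hits: qseqid sseqid pident length mismatch gapopen
# qstart qend sstart send evalue bitscore
printf 'read1\tprotA\t98.0\t60\t1\t0\t1\t180\t1\t60\t1e-20\t120\n' > "$hits"
printf 'read2\tprotB\t70.0\t30\t9\t0\t1\t90\t1\t30\t1e-2\t35\n' >> "$hits"

# Keep query IDs whose hit passes all three thresholds.
# Adding 0 forces awk to compare the fields as numbers.
kept=$(awk -F'\t' -v e="$evalue" -v l="$alnLen" -v m="$mismatch" \
    '$11 + 0 <= e + 0 && $4 + 0 >= l + 0 && $5 + 0 <= m + 0 { print $1 }' "$hits")

echo "$kept"
```

If the extension interprets these options differently, only the comparison inside the awk program would need to change.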
Take the UniRef50 database as an example. Download it and point sunbeam_config.yml to it:
mkdir -p /path/to/uniref50/
wget ftp://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref50/uniref50.fasta.gz -P /path/to/uniref50/
More docs.