Guidance regarding Snipe output #3

@NienkeMekkes

Dear authors,

Thank you for creating this tool. I tested it as follows: I created a simulated metagenomic dataset consisting of 13 different Salmonella strains at an abundance of 0.001 each and one E. coli strain at 0.987. All strain reference genomes were downloaded from NCBI. When profiling this dataset with Kraken2 and Bracken, I see a few false-positive species and some deviations in abundance, but overall the output is pretty good.

For Snipe, I added the accession numbers and strain designations to dict target and dict template for the REC module wherever they were not already present. I used the default Clostridium genome in filter.fna, and I added the paths to the reference genomes as targets for the MAP module.

I get the following output. I added the columns Genome and truth, giving the genome used and the known input abundance that I simulated. The Initial Best Hit column is copied from the output of the ID module and is pretty close to the truth. However, after running REC, I do not see an improvement in the output, so either I am misunderstanding the output or I am not running Snipe correctly. Could you provide guidance on how to use Snipe to reduce false-positive hits? Or should I change how I run Snipe and add more target or filter genomes? Here I only used the genomes I know to be in my simulated sample as targets.

| Genome | truth | Initial Best Hit | Rectified Final Guess | Final Guess | Rectified Probability | SSR Aligned Reads | Rectified Abundance | Initial Abundance | Final Best Hit | Final Best Hit Read Numbers |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| E.coli strain A | 0.987 | 0.98406 | 0.00000 | 0.98717 | 0.00000 | 0.00000 | 0.00000 | 0.98718 | 0.98718 | 6581197.00000 |
| Salmonella strain 1 | 0.001 | 0.00133 | 0.00000 | 0.00671 | 0.00000 | 0.00000 | 0.00000 | 0.00723 | 0.00723 | 48178.00000 |
| Salmonella strain 2 | 0.001 | 0.00125 | 0.00000 | 0.00166 | 0.00000 | 0.00000 | 0.00000 | 0.00175 | 0.00175 | 11669.00000 |
| Salmonella strain 3 | 0.001 | 0.00124 | 0.00004 | 0.00004 | 1.00000 | 6529.00000 | 0.00004 | 0.00004 | 0.00004 | 261.00000 |
| Salmonella strain 4 | 0.001 | 0.00123 | 0.00000 | 0.00027 | 0.00000 | 0.00000 | 0.00000 | 0.00018 | 0.00018 | 1201.00000 |
| Salmonella strain 5 | 0.001 | 0.00123 | 0.00000 | 0.00018 | 0.00000 | 0.00000 | 0.00000 | 0.00015 | 0.00015 | 999.00000 |
| Salmonella strain 6 | 0.001 | 0.00120 | 0.00221 | 0.00221 | 1.00000 | 6529.00000 | 0.00180 | 0.00180 | 0.00180 | 12030.00000 |
| Salmonella strain 7 | 0.001 | 0.00120 | 0.00001 | 0.00001 | 1.00000 | 6529.00000 | 0.00001 | 0.00001 | 0.00001 | 48.00000 |
| Salmonella strain 8 | 0.001 | 0.00117 | 0.00000 | 0.00001 | 0.00000 | 0.00000 | 0.00000 | 0.00001 | 0.00001 | 82.00000 |
| Salmonella strain 9 | 0.001 | 0.00117 | 0.00113 | 0.00113 | 1.00000 | 6529.00000 | 0.00115 | 0.00115 | 0.00115 | 7664.00000 |
| Salmonella strain 10 | 0.001 | 0.00117 | 0.00042 | 0.00042 | 1.00000 | 6529.00000 | 0.00032 | 0.00032 | 0.00032 | 2104.00000 |
| Salmonella strain 11 | 0.001 | 0.00116 | 0.00000 | 0.00000 | 1.00000 | 6529.00000 | 0.00000 | 0.00000 | 0.00000 | 24.00000 |
| Salmonella strain 12 | 0.001 | 0.00115 | 0.00000 | 0.00002 | 0.00000 | 0.00000 | 0.00000 | 0.00002 | 0.00002 | 139.00000 |
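
To put a number on "pretty close to the truth" versus "no improvement", here is a minimal Python sketch that compares two of the columns above against the simulated abundances. It is not part of Snipe. The values are copied by hand from the table, and it assumes that the second numeric column is Initial Best Hit and the second-to-last column is Final Best Hit. The names `ROWS` and `mean_abs_error` are made up for this illustration.

```python
# Values copied by hand from the table above. Tuple layout is an assumption
# about the column order: (genome, truth, initial_best_hit, final_best_hit).
ROWS = [
    ("E.coli strain A", 0.987, 0.98406, 0.98718),
    ("Salmonella strain 1", 0.001, 0.00133, 0.00723),
    ("Salmonella strain 2", 0.001, 0.00125, 0.00175),
    ("Salmonella strain 3", 0.001, 0.00124, 0.00004),
    ("Salmonella strain 4", 0.001, 0.00123, 0.00018),
    ("Salmonella strain 5", 0.001, 0.00123, 0.00015),
    ("Salmonella strain 6", 0.001, 0.00120, 0.00180),
    ("Salmonella strain 7", 0.001, 0.00120, 0.00001),
    ("Salmonella strain 8", 0.001, 0.00117, 0.00001),
    ("Salmonella strain 9", 0.001, 0.00117, 0.00115),
    ("Salmonella strain 10", 0.001, 0.00117, 0.00032),
    ("Salmonella strain 11", 0.001, 0.00116, 0.00000),
    ("Salmonella strain 12", 0.001, 0.00115, 0.00002),
]


def mean_abs_error(rows, column):
    """Mean absolute difference between rows[i][column] and the truth (index 1)."""
    return sum(abs(row[column] - row[1]) for row in rows) / len(rows)


# Only the low-abundance Salmonella strains are of interest here.
salmonella = [row for row in ROWS if row[0].startswith("Salmonella")]

mae_initial = mean_abs_error(salmonella, 2)  # Initial Best Hit vs truth
mae_final = mean_abs_error(salmonella, 3)    # Final Best Hit vs truth

print(f"MAE initial: {mae_initial:.6f}")
print(f"MAE final:   {mae_final:.6f}")
```

On the values posted here, the mean absolute error over the Salmonella strains is smaller for the Initial Best Hit column than for the Final Best Hit column, which matches the impression that the REC step did not improve the estimates for this sample.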
