# 9) get the depth of coverage for each readgroup, create a coverage mask and plots, and add failed variants to the coverage mask (artic_mask must be run before bcftools consensus)
docs/minion.md: 11 additions & 2 deletions
@@ -16,6 +16,8 @@ This page describes the core pipeline which is run via the `artic minion` comman
There are **2 workflows** baked into the core pipeline, one which uses signal data (via [nanopolish](https://github.com/jts/nanopolish)) and one that does not (via [medaka](https://github.com/nanoporetech/medaka)). As the workflows are identical in many ways, this page describes the pipeline as a whole and notes where the behaviour differs between the two workflows.
By default the `nanopolish` workflow is selected; you need to specify `--medaka` (and `--medaka-model`) if you want the medaka workflow enabled.
+
> **NOTE**: It is very important that you select the appropriate value for `--medaka-model`.
+
At the end of each stage, we list the "useful" stage output files which are kept. There will also be some additional files left over at the end of the pipeline, but these can be ignored (and are hopefully quite intuitively named).
## Stages
@@ -100,7 +102,7 @@ Finally, we use the `artic_vcf_filter` module to filter the merged variant file
### Consensus building
-
Prior to building a consensus, we use the post-processed alignment from the previous step to check each position of the reference sequence for sample coverage. Any position that is not covered by at least 20 reads from either read group is marked as low coverage. We use the `artic_make_depth_mask` module for this, which produces coverage information for each read group and also produces a coverage mask to tell us which coordinates in the reference sequence failed the coverage threshold. We use `artic_plot_amplicon_depth` to take the read group depth data and plot amplicon coverage.
+
Prior to building a consensus, we use the post-processed alignment from the previous step to check each position of the reference sequence for sample coverage. Any position that is not covered by at least 20 reads from either read group is marked as low coverage. We use the `artic_make_depth_mask` module for this, which produces coverage information for each read group and also produces a coverage mask to tell us which coordinates in the reference sequence failed the coverage threshold.
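To illustrate the coverage check described above, here is a minimal Python sketch. It is not the `artic_make_depth_mask` implementation: the function names, the dictionary input, and the reading of "from either read group" as "a position passes if at least one read group reaches the threshold" are all assumptions made for this example. Only the threshold of 20 reads comes from the text.

```python
from typing import Dict, List, Tuple

MIN_DEPTH = 20  # coverage threshold stated in the text


def low_coverage_positions(
    rg_depths: Dict[str, List[int]], min_depth: int = MIN_DEPTH
) -> List[int]:
    """Return 0-based positions where no read group reaches min_depth.

    rg_depths maps a (hypothetical) read group name to its per-position depth.
    """
    if not rg_depths:
        return []
    lengths = {len(depths) for depths in rg_depths.values()}
    if len(lengths) != 1:
        raise ValueError("all read groups must cover the same reference length")
    ref_len = lengths.pop()
    return [
        pos
        for pos in range(ref_len)
        if all(depths[pos] < min_depth for depths in rg_depths.values())
    ]


def to_intervals(positions: List[int]) -> List[Tuple[int, int]]:
    """Collapse sorted positions into half-open (start, end) intervals."""
    intervals: List[Tuple[int, int]] = []
    for pos in positions:
        if intervals and intervals[-1][1] == pos:
            # extend the current run of adjacent positions
            intervals[-1] = (intervals[-1][0], pos + 1)
        else:
            intervals.append((pos, pos + 1))
    return intervals


# Toy example: two read groups over a 4-base reference
depths = {"1": [25, 5, 0, 30], "2": [3, 10, 0, 40]}
mask = to_intervals(low_coverage_positions(depths))
```

Collapsing positions into intervals keeps the mask compact when whole amplicons drop out; the interval representation is a choice made for this sketch only.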
Next, to build a consensus sequence for a sample, we require a pre-consensus sequence based on the input reference sequence. The pre-consensus has low-quality sites masked out with `N`s using the coverage mask and the `$SAMPLE.fail.vcf` file. We then use `bcftools consensus` to combine the pre-consensus with the `$SAMPLE.pass.vcf` variants to produce a consensus sequence for the sample. The consensus sequence has the artic workflow written to its header.
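The masking step can be pictured with a short Python sketch. This is a simplified illustration rather than the `artic_mask` code: the function name, the half-open interval format for the coverage mask, and the treatment of each FAIL variant as a single 0-based position are assumptions for this example.

```python
from typing import Iterable, Tuple


def mask_reference(
    reference: str,
    mask_intervals: Iterable[Tuple[int, int]],
    fail_positions: Iterable[int] = (),
) -> str:
    """Return a pre-consensus with masked sites replaced by 'N'.

    mask_intervals: half-open (start, end) low-coverage intervals (assumed format).
    fail_positions: 0-based positions of FAIL variants (simplified to one base each).
    """
    seq = list(reference)
    for start, end in mask_intervals:
        # clamp to the reference bounds so a bad interval cannot raise
        for i in range(max(start, 0), min(end, len(seq))):
            seq[i] = "N"
    for pos in fail_positions:
        if 0 <= pos < len(seq):
            seq[pos] = "N"
    return "".join(seq)


# Toy example: mask positions 1-2 (low coverage) and position 6 (FAIL variant)
preconsensus = mask_reference("ACGTACGT", [(1, 3)], [6])
```

In the pipeline itself, the PASS variants are applied to the masked sequence by `bcftools consensus`, which is why the mask has to exist before that step runs.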
@@ -122,6 +124,13 @@ Finally, the consensus sequence is aligned against the reference sequence using
| artic_vcf_merge | combines VCF files from multiple read groups |
| artic_vcf_filter | filters a combined VCF into PASS and FAIL variant files |
| artic_make_depth_mask | create a coverage mask from the post-processed alignment |
-
| artic_plot_amplicon_depth | plots per amplicon coverage |
| artic_mask | combines the reference sequence, FAIL variants and coverage mask to produce a pre-consensus sequence |
| artic_fasta_header | applies the artic workflow and identifier to the consensus sequence header |
+
+
## Optional pipeline report
+
+
As of version 1.2.0, you can run the artic fork of MultiQC (which should be installed as part of the artic conda environment) and this will produce a report containing amplicon coverage plots and variant call information. To generate a report from within your pipeline output directory: