
Releases: umccr/cwl-ica

dragen-transcriptome-pipeline/4.2.4__20241210230924

Overview

MD5Sum: c66d8f1e95b11ae4c170c572607b8c15

Documentation

Documentation for dragen-transcriptome-pipeline v4.2.4

Dockstore

Dockstore Version Link

ICAv2

Tenant: umccr-prod

Bundles Generated

Bundle Name: dragen_transcriptome_pipeline_with_validation_data__4_2_4__20241210230924 / Bundle Version v9_r3__20241210230924

Description
This bundle has been generated by the release of workflows/dragen-transcriptome-pipeline/4.2.4/dragen-transcriptome-pipeline__4.2.4.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/dragen-transcriptome-pipeline/4.2.4__20241210230924.

Version Description
The bundle version description is currently redundant because versions cannot be appended to bundles. Regardless, the bundle version is v9_r3.

Bundle ID: 55384505-279d-496b-80c8-8c24fd0e06e3

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: 73e21ce0-60d7-4e3e-b130-88aff78d500d
    Pipeline Code: dragen-transcriptome-pipeline__4_2_4__20241210230924

Projects

  • development
  • staging

Datasets

  • dragen_hash_table_hg38_alt_masked_v9_r3_linear_cnv_hla_rna
  • hg38_fasta
  • arriba_2_4_0
  • hg38_v39_gencode_annotation
  • wts_validation_fastq__SBJ00480
  • wts_validation_fastq__SBJ00028
  • wts_validation_fastq__SBJ00061
  • wts_validation_fastq__SBJ00188
  • wts_validation_fastq__SBJ00199
  • wts_validation_fastq__SBJ00236
  • wts_validation_fastq__SBJ00238
  • wts_multiqc__2023_07_21__4_2_4__Ref_1_Good__SBJ01563
  • wts_multiqc__2023_07_21__4_2_4__Ref_2_Good__SBJ01147
  • wts_multiqc__2023_07_21__4_2_4__Ref_3_Good__SBJ01620
  • wts_multiqc__2023_07_21__4_2_4__Ref_4_Bad__SBJ01286
  • wts_multiqc__2023_07_21__4_2_4__Ref_5_Bad__SBJ01673

Bundle Name: dragen_transcriptome_pipeline_prod__4_2_4__20241210230924 / Bundle Version v9_r3__20241210230924

Description
This bundle has been generated by the release of workflows/dragen-transcriptome-pipeline/4.2.4/dragen-transcriptome-pipeline__4.2.4.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/dragen-transcriptome-pipeline/4.2.4__20241210230924.

Version Description
The bundle version description is currently redundant because versions cannot be appended to bundles. Regardless, the bundle version is v9_r3.

Bundle ID: 24f3ae95-6a38-4864-89e0-5d71174089c1

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: 73e21ce0-60d7-4e3e-b130-88aff78d500d
    Pipeline Code: dragen-transcriptome-pipeline__4_2_4__20241210230924

Projects

  • production

Datasets

  • dragen_hash_table_hg38_alt_masked_v9_r3_linear_cnv_hla_rna
  • hg38_fasta
  • arriba_2_4_0
  • hg38_v39_gencode_annotation
  • wts_multiqc__2023_07_21__4_2_4__Ref_1_Good__SBJ01563
  • wts_multiqc__2023_07_21__4_2_4__Ref_2_Good__SBJ01147
  • wts_multiqc__2023_07_21__4_2_4__Ref_3_Good__SBJ01620
  • wts_multiqc__2023_07_21__4_2_4__Ref_4_Bad__SBJ01286
  • wts_multiqc__2023_07_21__4_2_4__Ref_5_Bad__SBJ01673
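
The release tag, release URL, schema URL, and pipeline code listed for this release appear to follow one naming pattern. The sketch below reproduces that pattern as inferred from the values on this page. It is not an official cwl-ica API, and the function name `release_names` is hypothetical.

```python
from urllib.parse import quote

REPO_URL = "https://github.com/umccr/cwl-ica"


def release_names(workflow: str, version: str, timestamp: str) -> dict:
    """Derive the identifiers shown in a release entry (pattern inferred from this page)."""
    tag = f"{workflow}/{version}__{timestamp}"
    return {
        "tag": tag,
        "release_url": f"{REPO_URL}/releases/tag/{tag}",
        # The slash in the tag is percent-encoded (%2F) in the download URL.
        "schema_url": (
            f"{REPO_URL}/releases/download/{quote(tag, safe='')}/"
            f"{workflow}__{version}__{timestamp}.schema.json"
        ),
        # The pipeline code replaces the dots in the version with underscores.
        "pipeline_code": f"{workflow}__{version.replace('.', '_')}__{timestamp}",
    }


names = release_names("dragen-transcriptome-pipeline", "4.2.4", "20241210230924")
for key, value in names.items():
    print(f"{key}: {value}")
```

The derived `schema_url` is the value expected on the `# yaml-language-server: $schema=` line of the inputs template.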

Visual Overview

dragen-transcriptome-pipeline

Inputs Template

Yaml

# yaml-language-server: $schema=https://github.com/umccr/cwl-ica/releases/download/dragen-transcriptome-pipeline%2F4.2.4__20241210230924/dragen-transcriptome-pipeline__4.2.4__20241210230924.schema.json

# algorithm (Optional)
# Default value: proportional
# Docs: Counting algorithm:
# uniquely-mapped-reads (default) or proportional.
algorithm: "proportional"

# annotation file (Required)
# Docs: Path to annotation transcript file.
annotation_file:
  class: File
  location: icav2://project_id/path/to/file

# bam input (Optional)
# Docs: Input a BAM file for WTS analysis
bam_input:
  class: File
  location: icav2://project_id/path/to/file

# blacklist (Required)
# Docs: File with blacklist range
blacklist:
  class: File
  location: icav2://project_id/path/to/file

# cl config (Optional)
# Docs: command line config to supply additional config values on the command line.
cl_config: string

# contigs (Optional)
# Docs: Optional - List of interesting contigs
# If not specified, defaults to 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y
contigs: string

# cytobands (Required)
# Docs: Coordinates of the Giemsa staining bands.
cytobands:
  class: File
  location: icav2://project_id/path/to/file

# enable duplicate marking (Required)
# Docs: Mark identical alignments as duplicates
enable_duplicate_marking: false

# enable map align (Optional)
# Docs: Enabled by default.
# Set this value to false if using bam_input AND tumor_bam_input
enable_map_align: false

# enable map align output (Required)
# Docs: Do you wish to have the output bam files present
enable_map_align_output: false

# enable rna gene fusion (Optional)
# Docs: Optional - Enable the DRAGEN Gene Fusion module - defaults to true
enable_rna_gene_fusion: false

# enable rna quantification (Optional)
# Docs: Optional - Enable the quantification module - defaults to true
enable_rna_quantification: false

# enable sort (Optional)
# Docs: True by default, only set this to false if using --bam-input as input parameter
enable_sort: false

# fastq list (Optional)
# Docs: CSV file that contains a list of FASTQ files
# to process. read_1 and read_2 components in the CSV file must be presigned urls.
fastq_list:
  class: File
  location: icav2://project_id/path/to/file

# Row of fastq lists (Optional)
# Docs: The row of fastq lists.
# Each row has the following attributes:
#   * RGID
#   * RGLB
#   * RGSM
#   * Lane
#   * Read1File
#   * Read2File (optional)
fastq_list_rows:
- rgid: string
  rglb: string
  rgsm: string
  lane: string
  read_1:
    class: File
    location: icav2://project_id/path/to/file
  read_2:
    class: File
    location: icav2://project_id/path/to/file

# java mem (Optional)
# Default value: 20G
# Docs: Set desired Java heap memory size
java_mem: "20G"

# license instance id location (Optional)
# Docs: You may wish to supply your own.
# Optional value, default set to /opt/instance-identity
# which is a path inside the dragen container
lic_instance_id_location:
  class: File
  location: icav2://project_id/path/to/file

# ora reference tar (Optional)
# Docs: Path to ref data tarball
ora_reference_tar:
  class: File
  location: icav2://project_id/path/to/file

# output file prefix (Required)
# Docs: The prefix given to all output files
output_prefix: string

# protein domains (Required)
# Docs: GFF3 file containing the genomic coordinates of protein domains.
protein_domains:
  class: File
  location: icav2://project_id/path/to/file

# qc reference samples (Required)
# Docs: Reference samples for multiQC report
qc_reference_samples:
- class: Directory
  location: icav2://project_id/path/to/dir/

# read trimming (Optional)
# Docs: To enable trimming filters in hard-trimming mode, set to a comma-separated list of the trimmer tools 
# you would like to use. To disable trimming, set to none. During mapping, artifacts are removed from all reads.
# Read trimming is disabled by default.
read_trimmers: string

# reference Fasta (Required)
# Docs: FastA file with genome sequence
reference_fasta:
  class: File
  location: icav2://project_id/path/to/file

# reference tar (Required)
# Docs: Path to ref data tarball
reference_tar:
  class: File
  location: icav2://project_id/path/to/file

# soft read trimming (Optional)
# Docs: To enable trimming filters in soft-trimming mode, set to a comma-separated list of the trimmer tools 
# you would like to use. To disable soft trimming, set to none. During mapping, reads are aligned as if trimmed,
# and bases are not removed from the reads. Soft-trimming is enabled for the polyg filter by default.
soft_read_trimmers: string

# trim adapter r1 5prime (Optional)
# Docs: Specify the FASTA file that contains adapter sequences to trim from the 5' end of Read 1. 
# NB: the sequences should be in reverse order (with respect to their appearance in the FASTQ) but not complemented.
trim_adapter_r1_5prime:
  class: File
  location: icav2://project_id/path/to/file

# trim adapter read1 (Optional)
# Docs: Specify the FASTA file that contains adapter sequences to trim from the 3' end of Read 1.
trim_adapter_read1:
  class: File
  location: icav2://project_id/path/to/file

# trim adapter read2 (Optional)
# Docs: Specify the FASTA file that contains adapter sequences to trim from the 3' end of Read 2.
trim_adapter_read2:
  class: File
  location: icav2://project_id/path/to/file

# trim adapter stringency (Optional)
# Docs: Specify the minimum number of adapter bases required for trimming
trim_adapter_stringency: string

# trim adapter r2 5prime (Optional)
# Docs: Specify the FASTA file that contains adapter sequences to trim from the 5' end of Read 2.
# NB: the sequences should be in reverse order (with respect to their appearance in the FASTQ) but not complemented.
trim_dapter_r2_5prime:
  class: File
  location: icav2://project_id/path/to/file

# trim r1 3prime (Optional)
# Docs: Specify the minimum number of bases to trim from the 3' end of Read 1 (default: 0).
trim_r1_3prime: string

# trim r1 5prime (Optional)
# Docs: Specify the minimum number of bases to trim from the 5' end of Read 1 (default: 0).
trim_r1_5prime: string

# trim r2 3prime (Optional)
# Docs: Specify the minimum n...
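
The `fastq_list_rows` input in the template above is a list of rows, each with read-group fields and `File` objects at `icav2://` locations. Below is a minimal sketch of building such rows programmatically. The helper names, the project ID, and all sample values are hypothetical. The template does not pin down the type of `lane`, so it is passed through unchanged.

```python
import json


def icav2_file(project_id: str, path: str) -> dict:
    """Return a CWL File object pointing at an icav2:// location."""
    return {"class": "File", "location": f"icav2://{project_id}/{path.lstrip('/')}"}


def fastq_list_row(rgid, rglb, rgsm, lane, project_id, read_1, read_2=None) -> dict:
    """Build one fastq_list_rows entry; read_2 is left out for single-end data."""
    row = {
        "rgid": rgid,
        "rglb": rglb,
        "rgsm": rgsm,
        "lane": lane,  # type not stated in the template; passed through as given
        "read_1": icav2_file(project_id, read_1),
    }
    if read_2 is not None:
        row["read_2"] = icav2_file(project_id, read_2)
    return row


# Hypothetical values, for illustration only.
paired = fastq_list_row(
    "SAMPLE.1", "LIB01", "SAMPLE", 1, "my-project-id",
    "fastq/sample_R1.fastq.gz", "fastq/sample_R2.fastq.gz",
)
single = fastq_list_row("SAMPLE.2", "LIB01", "SAMPLE", 2, "my-project-id", "fastq/sample_L2_R1.fastq.gz")

inputs = {"output_prefix": "SAMPLE", "fastq_list_rows": [paired, single]}
print(json.dumps(inputs, indent=2))
```

The printed object covers only two of the template's inputs. The remaining required inputs (for example `annotation_file` and `reference_tar`) would still need to be added before submitting a run.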

dragen-somatic-with-germline-pipeline/4.2.4__20241210230846

Overview

MD5Sum: 2b7d1a489676ccf9588a3d5fff37cc4b

Documentation

Documentation for dragen-somatic-with-germline-pipeline v4.2.4

Dockstore

Dockstore Version Link

ICAv2

Tenant: umccr-prod

Bundles Generated

Bundle Name: dragen_somatic_with_germline_pipeline_with_validation_data__4_2_4__20241210230846 / Bundle Version v9_r3__20241210230846

Description
This bundle has been generated by the release of workflows/dragen-somatic-with-germline-pipeline/4.2.4/dragen-somatic-with-germline-pipeline__4.2.4.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/dragen-somatic-with-germline-pipeline/4.2.4__20241210230846.

Version Description
The bundle version description is currently redundant because versions cannot be appended to bundles. Regardless, the bundle version is v9_r3.

Bundle ID: 67470de3-79ef-498b-b1c3-1c1863a5cfb2

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: 7d3cb608-80e0-4ecf-a67e-ef524e9bfb8b
    Pipeline Code: dragen-somatic-with-germline-pipeline__4_2_4__20241210230846

Projects

  • development
  • staging

Datasets

  • dragen_hash_table_hg38_alt_masked_v9_r3_linear_cnv_hla_rna
  • wgs_validation_fastq__cups_pair_8
  • wgs_validation_fastq__2016_249_17_MH_P033
  • wgs_validation_fastq__2016_249_18_WH_P025
  • wgs_validation_fastq__B_ALL_Case_10
  • wgs_validation_fastq_Diploid_Never_Responder
  • wgs_validation_fastq_SBJ00303
  • wgs_validation_fastq_SEQC50
  • wgs_validation_fastq_SFRC01073
  • ora_reference_v2

Bundle Name: dragen_somatic_with_germline_pipeline_prod__4_2_4__20241210230846 / Bundle Version v9_r3__20241210230846

Description
This bundle has been generated by the release of workflows/dragen-somatic-with-germline-pipeline/4.2.4/dragen-somatic-with-germline-pipeline__4.2.4.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/dragen-somatic-with-germline-pipeline/4.2.4__20241210230846.

Version Description
The bundle version description is currently redundant because versions cannot be appended to bundles. Regardless, the bundle version is v9_r3.

Bundle ID: 7fee3ddc-c5de-48d1-b9d3-c3506fe548a2

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: 7d3cb608-80e0-4ecf-a67e-ef524e9bfb8b
    Pipeline Code: dragen-somatic-with-germline-pipeline__4_2_4__20241210230846

Projects

  • production

Datasets

  • dragen_hash_table_hg38_alt_masked_v9_r3_linear_cnv_hla_rna
  • ora_reference_v2

Visual Overview

dragen-somatic-with-germline-pipeline

Inputs Template

Yaml

# yaml-language-server: $schema=https://github.com/umccr/cwl-ica/releases/download/dragen-somatic-with-germline-pipeline%2F4.2.4__20241210230846/dragen-somatic-with-germline-pipeline__4.2.4__20241210230846.schema.json

# bam input (Optional)
# Docs: Input a normal BAM file for the variant calling stage
bam_input:
  class: File
  location: icav2://project_id/path/to/file

# cnv enable self normalization (Optional)
# Docs: Enable CNV self normalization.
# Self Normalization requires that the DRAGEN hash table be generated with the enable-cnv=true option.
cnv_enable_self_normalization: false

# cnv normal b allele vcf (Optional)
# Docs: Specify a matched normal SNV VCF.
cnv_normal_b_allele_vcf:
  class: File
  location: icav2://project_id/path/to/file

# cnv normal cnv vcf (Optional)
# Docs: Specify germline CNVs from the matched normal sample.
cnv_normal_cnv_vcf: false

# cnv population b allele vcf (Optional)
# Docs: Specify a population SNP catalog.
cnv_population_b_allele_vcf:
  class: File
  location: icav2://project_id/path/to/file

# cnv somatic enable het calling (Optional)
# Docs: Enable HET-calling mode for heterogeneous segments.
cnv_somatic_enable_het_calling: false

# cnv somatic enable lower ploidy limit (Optional)
# Docs: To improve accuracy on the tumor ploidy model estimation, the somatic WGS CNV caller estimates whether the chosen model calls 
# homozygous deletions on regions that are likely to reduce the overall fitness of cells, 
# which are therefore deemed to be "essential" and under negative selection. 
# In the current literature, recent efforts tried to map such cell-essential genes (eg, in 2015 - https://www.science.org/doi/10.1126/science.aac7041).
# The check on essential regions is controlled with --cnv-somatic-enable-lower-ploidy-limit (default true).
cnv_somatic_enable_lower_ploidy_limit: false

# cnv somatic essential genes bed (Optional)
# Docs: Default bedfiles describing the essential regions are provided for hg19, GRCh37, hs37d5, GRCh38, 
# but a custom bedfile can also be provided in input through the 
# --cnv-somatic-essential-genes-bed=<BEDFILE_PATH> parameter. 
# In such case, the feature is automatically enabled. 
# A custom essential regions bedfile needs to have the following format: 4-column, tab-separated, 
# where the first 3 columns identify the coordinates of the essential region (chromosome, 0-based start, excluded end). 
# The fourth column is the region id (string type). For the purpose of the algorithm, currently only the first 3 columns are used. 
# However, the fourth might be helpful to investigate manually which regions drove the decisions on model plausibility made by the caller.
cnv_somatic_essential_genes_bed: string

# cnv use somatic vc baf (Optional)
# Docs: If running in tumor-normal mode with the SNV caller enabled, use this option
# to specify the germline heterozygous sites.
cnv_use_somatic_vc_baf: false

# cnv use somatic vc vaf (Optional)
# Docs: Use the variant allele frequencies (VAFs) from the somatic SNVs to help select
# the tumor model for the sample.
cnv_use_somatic_vc_vaf: false

# cram input (Optional)
# Docs: Input a normal CRAM file for the variant calling stage
cram_input:
  class: File
  location: icav2://project_id/path/to/file

# cram reference (Optional)
# Docs: Path to the reference fasta file for the CRAM input. 
# Required only if the input is a CRAM file AND its reference is not the one in the tarball
cram_reference:
  class: File
  location: icav2://project_id/path/to/file

# dbsnp annotation (Optional)
# Docs: In Germline, Tumor-Normal somatic, or Tumor-Only somatic modes,
# DRAGEN can look up variant calls in a dbSNP database and add annotations for any matches that it finds there.
# To enable the dbSNP database search, set the --dbsnp option to the full path to the dbSNP database
# VCF or .vcf.gz file, which must be sorted in reference order.
dbsnp_annotation:
  class: File
  location: icav2://project_id/path/to/file

# deduplicate minimum quality (Optional)
# Docs: Specifies the Phred quality score below which a base should be excluded from the quality score
# calculation used for choosing among duplicate reads.
dedup_min_qual: string

# deduplicate minimum quality germline (Optional)
# Docs: Specifies the Phred quality score below which a base should be excluded from the quality score
# calculation used for choosing among duplicate reads.
dedup_min_qual_germline: string

# deduplicate minimum quality somatic (Optional)
# Docs: Specifies the Phred quality score below which a base should be excluded from the quality score
# calculation used for choosing among duplicate reads.
dedup_min_qual_somatic: string

# enable cnv calling (Optional)
# Docs: Enable CNV processing in the DRAGEN Host Software.
enable_cnv: false

# enable cnv germline (Optional)
# Docs: Enable CNV processing in the DRAGEN Host Software (germline only)
enable_cnv_germline: false

# enable cnv somatic (Optional)
# Docs: Enable CNV processing in the DRAGEN Host Software (somatic only)
enable_cnv_somatic: false

# enable duplicate marking (Optional)
# Docs: Enable the flagging of duplicate output
# alignment records.
enable_duplicate_marking: false

# enable duplicate marking germline (Optional)
# Docs: Enable the flagging of duplicate output
# alignment records.
enable_duplicate_marking_germline: false

# enable duplicate marking somatic (Optional)
# Docs: Enable the flagging of duplicate output
# alignment records.
enable_duplicate_marking_somatic: false

# enable hla (Optional)
# Docs: Enable HLA typing by setting --enable-hla flag to true
enable_hla: false

# enable hrd (Optional)
# Docs: Set to true to enable HRD scoring to quantify genomic instability.
# Requires somatic CNV calls.
enable_hrd: false

# enable map align (Optional)
# Docs: Enabled by default since --enable-variant-caller option is set to true.
# Set this value to false if using bam_input
enable_map_align: false

# enable map align germline (Optional)
# Docs: Enabled by default since --enable-variant-caller option is set to true.
# Set this value to false if using bam_input
enable_map_align_germline: false

# enable map align output (Optional)
# Docs: Enables saving the output from the
# map/align stage. Default is true when only
# running map/align. Default is false if
# running the variant caller.
enable_map_align_output: false

# enable map align output germline (Optional)
# Docs: Enables saving the output from the
# map/align stage. Default is true when only
# running map/align. Default is false if
# running the variant caller.
enable_map_align_output_germline: false

#...
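
The `cnv_somatic_essential_genes_bed` docs in the template above describe a custom essential-regions file as 4-column and tab-separated: chromosome, 0-based start, excluded end, and a region ID. The sketch below checks a file's text against that description before it is uploaded. The function name and the example coordinates are hypothetical. Rejecting empty regions (end not greater than start) is an added assumption, not something the docs state.

```python
def parse_essential_regions(text: str) -> list:
    """Parse 4-column, tab-separated essential-region lines into tuples."""
    regions = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError(
                f"line {line_number}: expected 4 tab-separated columns, got {len(fields)}"
            )
        chrom, start_text, end_text, region_id = fields
        start, end = int(start_text), int(end_text)
        # Start is 0-based and the end is excluded, so a non-empty region needs end > start.
        if start < 0 or end <= start:
            raise ValueError(f"line {line_number}: invalid interval {start}-{end}")
        regions.append((chrom, start, end, region_id))
    return regions


# Hypothetical coordinates, for illustration only.
example = "chr1\t1000\t2000\tregion_A\nchr2\t0\t500\tregion_B\n"
for region in parse_essential_regions(example):
    print(region)
```

Per the docs, the caller currently uses only the first three columns. The fourth is kept here because it helps when investigating which regions drove a model decision.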

dragen-build-reference-tarball-pipeline/4.2.4__20241210072731

Overview

MD5Sum: b7149af1a4ee0d28d7bc7cde8affe9b6

Documentation

Documentation for dragen-build-reference-tarball v4.2.4

Dockstore

Dockstore Version Link

ICAv2

Tenant: umccr-prod

Bundles Generated

Bundle Name: dragen_reference_tarball_generation__20241210072731 / Bundle Version 4.2.4__20241210072731

Description
This bundle has been generated by the release of workflows/dragen-build-reference-tarball-pipeline/4.2.4/dragen-build-reference-tarball-pipeline__4.2.4.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/dragen-build-reference-tarball-pipeline/4.2.4__20241210072731.

Version Description
The bundle version description is currently redundant because versions cannot be appended to bundles. Regardless, the bundle version is 4.2.4.

Bundle ID: 0f66c371-97cd-4ca6-acfb-33d184ef2acd

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: 5b00a816-3adc-443d-a7a1-e2e1f038c1e8
    Pipeline Code: dragen-build-reference-tarball-pipeline__4_2_4__20241210072731

Projects

  • development

Datasets

  • hg38_fasta

Visual Overview

dragen-build-reference-tarball-pipeline

Inputs Template

Yaml

# yaml-language-server: $schema=https://github.com/umccr/cwl-ica/releases/download/dragen-build-reference-tarball-pipeline%2F4.2.4__20241210072731/dragen-build-reference-tarball-pipeline__4.2.4__20241210072731.schema.json

# enable cnv (Optional)
# Docs: For the DRAGEN CNV pipeline, the hashtable must be generated with the --enable-cnv option set to true,
# in addition to any other options required by other pipelines.
# When --enable-cnv is true, dragen generates an additional k-mer uniqueness map that the CNV algorithm uses to
# counteract mappability biases.
# The k-mer uniqueness map file only needs to be generated once per reference hashtable
# and takes about 1.5 hours per whole human genome.
enable_cnv: false

# ht alt aware validation (Optional)
# Docs: When building hash tables from a reference that contains ALT-contigs,
# building with a liftover file is required.
# To disable this requirement, set the --ht-alt-aware-validate option to false.
ht_alt_aware_validate: false

# ht alt aware liftover (Optional)
# Docs: The --ht-alt-liftover option specifies the path to the liftover file to build an ALT-aware hash table.
# This option is required when building from a reference with ALT contigs.
# SAM liftover files for hg38DH and hg19 are provided in /opt/edico/liftover.

# For hg38 references, use bwa-kit_hs38DH_liftover.sam
# For hg19 references, use hg19_alt_liftover.sam
ht_alt_liftover:
  class: File
  location: icav2://project_id/path/to/file

# ht build hla hash table (Optional)
# Default value: True
# Docs: Used when --enable-hla is set to true for any given dragen workflow.
# This option must be used when running dragen workflows on hla data.
ht_build_hla_hashtable: true

# ht build rna hash table (Optional)
# Default value: True
# Docs: Used when --enable-rna is set to true for any given dragen workflow.
# This option must be used when running dragen workflows on rna data.
ht_build_rna_hashtable: true

# cost coefficient for hit frequency (Optional)
# Docs: The --ht-cost-coeff-seed-freq option assigns the cost component for the difference between
# the target hit frequency and the number of hits populated for a single seed.
# Higher values result primarily in high-frequency seeds being extended further to bring their frequencies down
# toward the target.
ht_cost_coeff_seed_freq: string

# cost coefficient for seed length (Optional)
# Docs: The --ht-cost-coeff-seed-len option assigns the cost component for each base by which a seed is extended.
# Additional bases are considered a cost because longer seeds risk overlapping variants or sequencing errors and
# losing their correct mappings. Higher values lead to shorter final seed extensions.
ht_cost_coeff_seed_len: string

# cost penalty for seed extension (Optional)
# Docs: The --ht-cost-penalty option assigns a flat cost for extending beyond the primary seed length.
# A higher value results in fewer seeds being extended at all.
# Current testing shows that zero (0) is appropriate for this parameter.
ht_cost_penalty: string

# cost increment for extension step (Optional)
# Docs: The --ht-cost-penalty-incr option assigns a recurring cost for each incremental seed extension step
# taken from primary to final extended seed length.
# More steps are considered a higher cost because extending in many small steps requires
# more hash table space for intermediate EXTEND records,
# and takes substantially more run time to execute the extensions.
# A higher value results in seed extension trees with fewer nodes,
# reaching from the root primary seed length to leaf extended seed lengths in fewer, larger steps.
ht_cost_penalty_incr: string

# ht decoys path (Optional)
# Docs: The DRAGEN software automatically detects the use of hg19 and hg38 references and
# adds decoys to the hash table when they are not found in the FASTA file.
# Use the --ht-decoys option to specify the path to a decoys file.
# The default is /opt/edico/liftover/hs_decoys.fa.
ht_decoys: string

# ht mask bed (Optional)
# Docs: Specifies the BED file for base masking.
ht_mask_bed:
  class: File
  location: icav2://project_id/path/to/file

# ht max dec factor (Optional)
# Docs: Seed thinning is an experimental technique to improve mapping performance in high-frequency regions.
# When primary seeds have higher frequency than the cap indicated by the --ht-soft-seed-freq-cap option,
# only a fraction of seed positions are populated to stay under the cap.

# The --ht-max-dec-factor option specifies a maximum factor by which seeds can be thinned.

# For example, --ht-max-dec-factor 3 retains at least 1/3 of the original seeds. --ht-max-dec-factor 1
# disables any thinning.

# Seeds are decimated in careful patterns to prevent leaving any long gaps unpopulated.

# The idea is that seed thinning can achieve mapped seed coverage in high frequency reference regions
# where the maximum hit frequency would otherwise have been exceeded.

# Seed thinning can also keep seed extensions shorter, which is also good for successful mapping.
# Based on testing to date, seed thinning has not proven to be superior to other accuracy optimization methods.
ht_max_dec_factor: string

# ht maximum seed length (Optional)
# Docs: The --ht-max-ext-seed-len option limits the length of extended seeds populated into the hash table.
# Primary seeds (length specified by --ht-seed-len) that match many reference positions can be extended
# to achieve more unique matching, which may be required to map seeds within the maximum hit frequency
# (--ht-max-seed-freq).
# Given a primary seed length k, the maximum seed length can be configured between k and k+128.
# The default is the upper bound, k+128.
ht_max_ext_seed_len: string

# ht maximum hit frequency (Optional)
# Docs: The --ht-max-seed-freq option sets a firm limit on the number of seed hits (reference genome locations)
# that can be populated for any primary or extended seed.

# If a given primary seed maps to more reference positions than this limit,
# it must be extended long enough that the extended seeds subdivide into smaller groups of identical
# seeds under the limit. If, even at the maximum extended seed length (--ht-max-ext-seed-len),
# a group of identical reference seeds is larger than this limit,
# their reference positions are not populated into the hash table.
# Instead, dragen populates a single High Frequency record.
# The maximum hit frequency can be configured from 1 to 256.
# However, if this value is too low, hash table construction can fail because too many seed extensions are needed.
# The practical minimum for a whole human genome reference, other options being default, is 8.
ht_max_seed_freq: string

# ht max table chunks (Optional)
# Docs: The --ht-max-table-chunks option controls the memory footprint during hash table construction by
# limiting the number of ~1 GB hash table chunks that reside in memory simultaneously.
# Each additional chunk consumes roughly twice its size (~2 GB) in system memory during construction.

# The hash table is divided into power-of-two independent chunks, of a fixed chunk size, X,
# which depends on the hash table size, in the range 0.5 GB < X ≤ 1 GB.

# For example, a 24 GB hash table contains 32 independent 0.75 GB chunks that can be constructed by parallel
# threads with enough memory and a 16 GB hash table contains 16 independent 1 GB chunks.

# The default is --ht-max-table-chunks equal to --ht-num-threads,
# but with a minimum default --ht-max-table-chunks of 8.

# It makes sense to have these two options match, because building one hash table chunk requires one chunk space
# in memory and one thread to work on it. Nevertheless, there are build-speed advantages to
# raising --ht-max-table-chunks higher than --ht-num-threads, or to raising --ht-num-threads higher
# than --ht-max-table-chunks.

# For example, the DRAGEN servers contain 24 cores that have hyperthreading enabled,
# so a value of 32 should be used. When using a higher value, --ht-max-table-chunks needs to be adjusted
# as well. The servers have 128 GB of memory available.
ht_max_table_chunks: string

# ht mem limit (Opt...
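
The `ht_max_table_chunks` docs in the template above give the chunking rule in words: a power-of-two number of chunks, each of size X with 0.5 GB < X ≤ 1 GB. The sketch below works through that arithmetic and the stated default for `--ht-max-table-chunks`. The function names are hypothetical. Treating a table of 1 GB or less as a single chunk is an assumption, and the memory figure is only a rough estimate from the "roughly twice its size" statement.

```python
def table_chunks(table_size_gb: float) -> tuple:
    """Return (chunk_count, chunk_size_gb) for a hash table of the given size."""
    if table_size_gb <= 0:
        raise ValueError("table size must be positive")
    chunk_count = 1
    # Double the chunk count until each chunk is at most 1 GB.
    while table_size_gb / chunk_count > 1:
        chunk_count *= 2
    return chunk_count, table_size_gb / chunk_count


def default_max_table_chunks(num_threads: int) -> int:
    """Default per the docs: equal to --ht-num-threads, with a minimum of 8."""
    return max(8, num_threads)


def rough_build_memory_gb(table_size_gb: float, max_table_chunks: int) -> float:
    """Rough estimate only: each resident chunk is taken to cost about twice its size."""
    chunk_count, chunk_size_gb = table_chunks(table_size_gb)
    resident = min(chunk_count, max_table_chunks)
    return resident * 2 * chunk_size_gb


for size_gb in (24, 16):
    count, chunk_gb = table_chunks(size_gb)
    print(f"{size_gb} GB table: {count} chunks of {chunk_gb} GB")

print(default_max_table_chunks(4), default_max_table_chunks(32))
print(rough_build_memory_gb(24, 32))
```

The two table sizes reproduce the examples in the docs: a 24 GB table gives 32 chunks of 0.75 GB, and a 16 GB table gives 16 chunks of 1 GB.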

dragen-build-reference-tarball-pipeline/4.2.4__20241210071831

Overview

MD5Sum: b7149af1a4ee0d28d7bc7cde8affe9b6

Documentation

Documentation for dragen-build-reference-tarball v4.2.4

Dockstore

Dockstore Version Link

Visual Overview

dragen-build-reference-tarball-pipeline

Inputs Template

Yaml

# yaml-language-server: $schema=https://github.com/umccr/cwl-ica/releases/download/dragen-build-reference-tarball-pipeline%2F4.2.4__20241210071831/dragen-build-reference-tarball-pipeline__4.2.4__20241210071831.schema.json

# enable cnv (Optional)
# Docs: For the DRAGEN CNV pipeline, the hashtable must be generated with the --enable-cnv option set to true,
# in addition to any other options required by other pipelines.
# When --enable-cnv is true, dragen generates an additional k-mer uniqueness map that the CNV algorithm uses to
# counteract mappability biases.
# The k-mer uniqueness map file only needs to be generated once per reference hashtable
# and takes about 1.5 hours per whole human genome.
enable_cnv: false

# ht alt aware validation (Optional)
# Docs: When building hash tables from a reference that contains ALT-contigs,
# building with a liftover file is required.
# To disable this requirement, set the --ht-alt-aware-validate option to false.
ht_alt_aware_validate: false

# ht alt aware liftover (Optional)
# Docs: The --ht-alt-liftover option specifies the path to the liftover file to build an ALT-aware hash table.
# This option is required when building from a reference with ALT contigs.
# SAM liftover files for hg38DH and hg19 are provided in /opt/edico/liftover.

# For hg38 references, use bwa-kit_hs38DH_liftover.sam
# For hg19 references, use hg19_alt_liftover.sam
ht_alt_liftover:
  class: File
  location: icav2://project_id/path/to/file

# ht build hla hash table (Optional)
# Default value: True
# Docs: Used when --enable-hla is set to true for any given dragen workflow.
# This option must be used when running dragen workflows on hla data.
ht_build_hla_hashtable: true

# ht build rna hash table (Optional)
# Default value: True
# Docs: Used when --enable-rna is set to true for any given dragen workflow.
# This option must be used when running dragen workflows on rna data.
ht_build_rna_hashtable: true

# cost coefficient for hit frequency (Optional)
# Docs: The --ht-cost-coeff-seed-freq option assigns the cost component for the difference between
# the target hit frequency and the number of hits populated for a single seed.
# Higher values result primarily in high-frequency seeds being extended further to bring their frequencies down
# toward the target.
ht_cost_coeff_seed_freq: string

# cost coefficient for seed length (Optional)
# Docs: The --ht-cost-coeff-seed-len option assigns the cost component for each base by which a seed is extended.
# Additional bases are considered a cost because longer seeds risk overlapping variants or sequencing errors and
# losing their correct mappings. Higher values lead to shorter final seed extensions.
ht_cost_coeff_seed_len: string

# cost penalty for seed extension (Optional)
# Docs: The --ht-cost-penalty option assigns a flat cost for extending beyond the primary seed length.
# A higher value results in fewer seeds being extended at all.
# Current testing shows that zero (0) is appropriate for this parameter.
ht_cost_penalty: string

# cost increment for extension step (Optional)
# Docs: The --ht-cost-penalty-incr option assigns a recurring cost for each incremental seed extension step
# taken from primary to final extended seed length.
# More steps are considered a higher cost because extending in many small steps requires
# more hash table space for intermediate EXTEND records,
# and takes substantially more run time to execute the extensions.
# A higher value results in seed extension trees with fewer nodes,
# reaching from the root primary seed length to leaf extended seed lengths in fewer, larger steps.
ht_cost_penalty_incr: string

# ht decoys path (Optional)
# Docs: The DRAGEN software automatically detects the use of hg19 and hg38 references and
# adds decoys to the hash table when they are not found in the FASTA file.
# Use the --ht-decoys option to specify the path to a decoys file.
# The default is /opt/edico/liftover/hs_decoys.fa.
ht_decoys: string

# ht mask bed (Optional)
# Docs: Specifies the BED file for base masking.
ht_mask_bed:
  class: File
  location: icav2://project_id/path/to/file

# ht max dec factor (Optional)
# Docs: Seed thinning is an experimental technique to improve mapping performance in high-frequency regions.
# When primary seeds have higher frequency than the cap indicated by the --ht-soft-seed-freq-cap option,
# only a fraction of seed positions are populated to stay under the cap.

# The --ht-max-dec-factor option specifies a maximum factor by which seeds can be thinned.

# For example, --ht-max-dec-factor 3 retains at least 1/3 of the original seeds. --ht-max-dec-factor 1
# disables any thinning.

# Seeds are decimated in careful patterns to prevent leaving any long gaps unpopulated.

# The idea is that seed thinning can achieve mapped seed coverage in high frequency reference regions
# where the maximum hit frequency would otherwise have been exceeded.

# Seed thinning can also keep seed extensions shorter, which is also good for successful mapping.
# Based on testing to date, seed thinning has not proven to be superior to other accuracy optimization methods.
ht_max_dec_factor: string

# ht maximum seed length (Optional)
# Docs: The --ht-max-ext-seed-len option limits the length of extended seeds populated into the hash table.
# Primary seeds (length specified by --ht-seed-len) that match many reference positions can be extended
# to achieve more unique matching, which may be required to map seeds within the maximum hit frequency
# (--ht-max-seed-freq).
# Given a primary seed length k, the maximum seed length can be configured between k and k+128.
# The default is the upper bound, k+128.
ht_max_ext_seed_len: string

# ht maximum hit frequency (Optional)
# Docs: The --ht-max-seed-freq option sets a firm limit on the number of seed hits (reference genome locations)
# that can be populated for any primary or extended seed.

# If a given primary seed maps to more reference positions than this limit,
# it must be extended long enough that the extended seeds subdivide into smaller groups of identical
# seeds under the limit. If, even at the maximum extended seed length (--ht-max-ext-seed-len),
# a group of identical reference seeds is larger than this limit,
# their reference positions are not populated into the hash table.
# Instead, dragen populates a single High Frequency record.
# The maximum hit frequency can be configured from 1 to 256.
# However, if this value is too low, hash table construction can fail because too many seed extensions are needed.
# The practical minimum for a whole human genome reference, other options being default, is 8.
ht_max_seed_freq: string

# ht max table chunks (Optional)
# Docs: The --ht-max-table-chunks option controls the memory footprint during hash table construction by
# limiting the number of ~1 GB hash table chunks that reside in memory simultaneously.
# Each additional chunk consumes roughly twice its size (~2 GB) in system memory during construction.

# The hash table is divided into power-of-two independent chunks, of a fixed chunk size, X,
# which depends on the hash table size, in the range 0.5 GB < X ≤ 1 GB.

# For example, a 24 GB hash table contains 32 independent 0.75 GB chunks that can be constructed by parallel
# threads with enough memory and a 16 GB hash table contains 16 independent 1 GB chunks.

# The default is --ht-max-table-chunks equal to --ht-num-threads,
# but with a minimum default --ht-max-table-chunks of 8.

# It makes sense to have these two options match, because building one hash table chunk requires one chunk space
# in memory and one thread to work on it. Nevertheless, there are build-speed advantages to
# raising --ht-max-table-chunks higher than --ht-num-threads, or to raising --ht-num-threads higher
# than --ht-max-table-chunks.

# For example, the DRAGEN servers contain 24 cores that have hyperthreading enabled,
# so a value of 32 should be used. When using a higher value, --ht-max-table-chunks needs to be adjusted
# as well. The servers have 128 GB of memory available.
ht_max_table_chunks: string

# ht mem limit (Optional)
# Docs: The --ht-mem-limit option controls the generated hash table size by specifying the DRAGEN board memory available
# for both the hash table and the encoded reference genome.
# The --ht-mem-limit option defaults to 32 GB when the reference genome approaches WHG size,
# or to a generous size for smaller references. Normally there is little reason to override these defaults.
ht_mem_limit: string

# ht methylated (Optional)
# Docs: DRAGEN methylation runs require building a special pair of hash tables with reference bases
# converted from C->T for one table, and G->A for the other.
# When running the hash table generation with the --ht-methylated option, these conversions are done automatically,
# and the converted hash tables are generated in a pair of subdirectories of the target directory
# specified with --output-directory.
# The subdirectories are named CT_converted and GA_converted, corresponding to the automatic base conversions.
# When using these hash tables for methylated alignment runs, refer to the original --output-directory and not
# to either of the automatically gener...
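The chunk arithmetic in the --ht-max-table-chunks description above can be checked with a short sketch. It assumes the chunk count is the smallest power of two that brings the chunk size down to 1 GB or less, which reproduces both worked examples in the docs (24 GB and 16 GB). The function name `chunk_layout` is hypothetical and not part of DRAGEN.

```python
from __future__ import annotations


def chunk_layout(table_gb: float) -> tuple[int, float]:
    """Return (chunk_count, chunk_size_gb) for a hash table of table_gb gigabytes.

    Assumption (not confirmed by the DRAGEN docs): the chunk count is the
    smallest power of two for which each chunk is at most 1 GB. For tables
    larger than 0.5 GB this keeps the chunk size in 0.5 GB < X <= 1 GB.
    """
    if table_gb <= 0:
        raise ValueError("table_gb must be positive")
    chunks = 1
    # Double the chunk count until a single chunk fits in 1 GB.
    while table_gb / chunks > 1.0:
        chunks *= 2
    return chunks, table_gb / chunks


if __name__ == "__main__":
    for size_gb in (24, 16):
        count, chunk_gb = chunk_layout(size_gb)
        print(f"{size_gb} GB table: {count} chunks of {chunk_gb} GB")
```

Under this assumption a 24 GB table yields 32 chunks of 0.75 GB and a 16 GB table yields 16 chunks of 1 GB, matching the figures quoted in the option description.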

dragen-instrument-run-fastq-to-ora-pipeline/4.2.4__20241122070556

Overview

MD5Sum: 597d249e2e4c99a0d3846b27bcd7fea5

Documentation

This tool can be used for archiving purposes by first compressing fastqs prior to transfer to a long-term storage location.

Dockstore

Dockstore Version Link

ICAv2

Tenant: umccr-prod

Bundles Generated

Bundle Name: ora_instrument_run_compression_pipeline_with_reference__4_2_4__20241122070556 / Bundle Version v2__20241122070556

Description
This bundle has been generated by the release of workflows/dragen-instrument-run-fastq-to-ora-pipeline/4.2.4/dragen-instrument-run-fastq-to-ora-pipeline__4.2.4.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/dragen-instrument-run-fastq-to-ora-pipeline/4.2.4__20241122070556.

Version Description
Bundle version description is currently redundant while we cannot append versions to bundles. Regardless, the bundle version is v2

Bundle ID: 43ad35e4-6017-44e6-b7cf-4a9f059d0ae6

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: 0540fca4-cc40-45ac-88e2-d32df69c6954
    Pipeline Code: dragen-instrument-run-fastq-to-ora-pipeline__4_2_4__20241122070556

Projects

  • development
  • staging
  • production

Datasets

  • ora_reference_v2

Visual Overview

dragen-instrument-run-fastq-to-ora-pipeline

Inputs Template

Yaml

# yaml-language-server: $schema=https://github.com/umccr/cwl-ica/releases/download/dragen-instrument-run-fastq-to-ora-pipeline%2F4.2.4__20241122070556/dragen-instrument-run-fastq-to-ora-pipeline__4.2.4__20241122070556.schema.json

# instrument run directory (Required)
# Docs: The directory containing the instrument run. Expected to be in the BCLConvert 4.2.7 output format, with the following structure:
#   Reports/
#   InterOp/
#   Logs/
#   Samples/
#   Samples/Lane_1/
#   Samples/Lane_1/Sample_ID/
#   Samples/Lane_1/Sample_ID/Sample_ID_S1_L001_R1_001.fastq.gz
#   Samples/Lane_1/Sample_ID/Sample_ID_S1_L001_R2_001.fastq.gz
#   etc...
instrument_run_directory:
  class: Directory
  location: icav2://project_id/path/to/dir/

# ora check file integrity (Optional)
# Default value: False
# Docs: Set to true to perform and output result of FASTQ file and decompressed FASTQ.ORA integrity check. The default value is false.
ora_check_file_integrity: false

# ora parallel files (Optional)
# Default value: 2
# Docs: The number of files to compress in parallel. If using an FPGA medium instance in the 
# run_dragen_instrument_run_fastq_to_ora_step this should be set to 16 / ora_threads_per_file.
ora_parallel_files: 2

# ora print file info (Optional)
# Default value: False
# Docs: Prints file information summary of ORA compressed files.
ora_print_file_info: false

# ora reference (Required)
# Docs: The reference tar to use for the ORA compression
ora_reference:
  class: File
  location: icav2://project_id/path/to/file

# ora threads per file (Optional)
# Default value: 8
# Docs: The number of threads to use per file. If using an FPGA medium instance in the 
# run_dragen_instrument_run_fastq_to_ora_step this should be set to 4 since there are only 16 cores available
ora_threads_per_file: 8

# sample id list (Optional)
# Docs: Optional list of samples to process.  
# Samples NOT in this list are NOT compressed AND NOT transferred to the final output directory!
sample_id_list:
- string
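
The relationship between `ora_parallel_files` and `ora_threads_per_file` described in the template above can be sketched as a small helper. The 16-core figure for the FPGA medium instance comes from the docs. The function name and the use of integer division when the core count is not an exact multiple are assumptions made for illustration.

```python
def ora_parallel_files(available_cores: int, threads_per_file: int) -> int:
    """Number of files to compress in parallel without oversubscribing cores.

    Hypothetical helper: it follows the documented rule
    ora_parallel_files = 16 / ora_threads_per_file, generalised to any
    core count. Rounding down is an assumption.
    """
    if threads_per_file < 1:
        raise ValueError("threads_per_file must be at least 1")
    if available_cores < threads_per_file:
        raise ValueError("not enough cores for a single file")
    return available_cores // threads_per_file


if __name__ == "__main__":
    # FPGA medium instance: 16 cores, 4 threads per file as the docs recommend.
    print(ora_parallel_files(16, 4))
```

With the template defaults (8 threads per file, 2 parallel files) the product is also 16, so the defaults already match a 16-core instance.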

Json

{
    "instrument_run_directory": {
        "class": "Directory",
        "location": "icav2://project_id/path/to/dir/"
    },
    "ora_check_file_integrity": false,
    "ora_parallel_files": 2,
    "ora_print_file_info": false,
    "ora_reference": {
        "class": "File",
        "location": "icav2://project_id/path/to/file"
    },
    "ora_threads_per_file": 8,
    "sample_id_list": [
        "string"
    ]
}

Outputs Template

{
    "output_directory": {
        "class": "Directory",
        "location": "icav2://project_id/path/to/dir/"
    }
}

Overrides Template

Zipped workflow

[
    "workflow.cwl#dragen-instrument-run-fastq-to-ora-pipeline--4.2.4/run_dragen_instrument_run_fastq_to_ora_step"
]

Packed workflow

[
    "#main/run_dragen_instrument_run_fastq_to_ora_step"
]
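
Comparing the zipped and packed override lists above, each packed ID appears to be the step name from the zipped ID prefixed with `#main/`. The sketch below encodes that observation. It is inferred from these templates only and is not a documented rule, and the function name is hypothetical.

```python
def to_packed_step_id(zipped_step_id: str) -> str:
    """Map a zipped-workflow step ID to its packed-workflow form.

    Assumption inferred from the templates: the packed ID is '#main/'
    followed by the final path component of the zipped ID.
    """
    _, separator, step_name = zipped_step_id.rpartition("/")
    if not separator or not step_name:
        raise ValueError(f"no step name found in {zipped_step_id!r}")
    return f"#main/{step_name}"


if __name__ == "__main__":
    zipped = (
        "workflow.cwl#dragen-instrument-run-fastq-to-ora-pipeline--4.2.4/"
        "run_dragen_instrument_run_fastq_to_ora_step"
    )
    print(to_packed_step_id(zipped))
```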

Inputs

instrument run directory

ID: instrument_run_directory

Optional: False
Type: Directory
Docs:
The directory containing the instrument run. Expected to be in the BCLConvert 4.2.7 output format, with the following structure:
Reports/
InterOp/
Logs/
Samples/
Samples/Lane_1/
Samples/Lane_1/Sample_ID/
Samples/Lane_1/Sample_ID/Sample_ID_S1_L001_R1_001.fastq.gz
Samples/Lane_1/Sample_ID/Sample_ID_S1_L001_R2_001.fastq.gz
etc...

ora check file integrity

ID: ora_check_file_integrity

Optional: False
Type: boolean
Docs:
Set to true to perform and output result of FASTQ file and decompressed FASTQ.ORA integrity check. The default value is false.

ora parallel files

ID: ora_parallel_files

Optional: True
Type: int
Docs:
The number of files to compress in parallel. If using an FPGA medium instance in the
run_dragen_instrument_run_fastq_to_ora_step this should be set to 16 / ora_threads_per_file.

ora print file info

ID: ora_print_file_info

Optional: False
Type: boolean
Docs:
Prints file information summary of ORA compressed files.

ora reference

ID: ora_reference

Optional: False
Type: File
Docs:
The reference tar to use for the ORA compression

ora threads per file

ID: ora_threads_per_file

Optional: True
Type: int
Docs:
The number of threads to use per file. If using an FPGA medium instance in the
run_dragen_instrument_run_fastq_to_ora_step this should be set to 4 since there are only 16 cores available

sample id list

ID: sample_id_list

Optional: True
Type: .[]
Docs:
Optional list of samples to process.
Samples NOT in this list are NOT compressed AND NOT transferred to the final output directory!

Steps

Run Dragen Instrument Run Fastq to ORA

ID: dragen-instrument-run-fastq-to-ora-pipeline--4.2.4/run_dragen_instrument_run_fastq_to_ora_step

Step Type: tool
Docs:

Run the dragen instrument run fastq to ora tool

Outputs

output directory

ID: dragen-instrument-run-fastq-to-ora-pipeline--4.2.4/output_directory

Optional: False
Output Type: Directory
Docs:
The output directory of the instrument run with fastqs converted to oras

dragen-instrument-run-fastq-to-ora-pipeline/4.2.4__20241120224050

Overview

MD5Sum: d292da07e5425d9879ba869ab58ff316

Documentation

This tool can be used for archiving purposes by first compressing fastqs prior to transfer to a long-term storage location.

Dockstore

Dockstore Version Link

ICAv2

Tenant: umccr-prod

Bundles Generated

Bundle Name: ora_instrument_run_compression_pipeline_with_reference__4_2_4__20241120224050 / Bundle Version v2__20241120224050

Description
This bundle has been generated by the release of workflows/dragen-instrument-run-fastq-to-ora-pipeline/4.2.4/dragen-instrument-run-fastq-to-ora-pipeline__4.2.4.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/dragen-instrument-run-fastq-to-ora-pipeline/4.2.4__20241120224050.

Version Description
Bundle version description is currently redundant while we cannot append versions to bundles. Regardless, the bundle version is v2

Bundle ID: 49663293-6664-479e-82ce-7e8b7067499a

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: 5c1c2fa2-30dc-46ed-9e7f-dc4fefac77b6
    Pipeline Code: dragen-instrument-run-fastq-to-ora-pipeline__4_2_4__20241120224050

Projects

  • development
  • staging
  • production

Datasets

  • ora_reference_v2

Visual Overview

dragen-instrument-run-fastq-to-ora-pipeline

Inputs Template

Yaml

# yaml-language-server: $schema=https://github.com/umccr/cwl-ica/releases/download/dragen-instrument-run-fastq-to-ora-pipeline%2F4.2.4__20241120224050/dragen-instrument-run-fastq-to-ora-pipeline__4.2.4__20241120224050.schema.json

# instrument run directory (Required)
# Docs: The directory containing the instrument run. Expected to be in the BCLConvert 4.2.7 output format, with the following structure:
#   Reports/
#   InterOp/
#   Logs/
#   Samples/
#   Samples/Lane_1/
#   Samples/Lane_1/Sample_ID/
#   Samples/Lane_1/Sample_ID/Sample_ID_S1_L001_R1_001.fastq.gz
#   Samples/Lane_1/Sample_ID/Sample_ID_S1_L001_R2_001.fastq.gz
#   etc...
instrument_run_directory:
  class: Directory
  location: icav2://project_id/path/to/dir/

# ora check file integrity (Optional)
# Default value: False
# Docs: Set to true to perform and output result of FASTQ file and decompressed FASTQ.ORA integrity check. The default value is false.
ora_check_file_integrity: false

# ora parallel files (Optional)
# Default value: 2
# Docs: The number of files to compress in parallel. If using an FPGA medium instance in the 
# run_dragen_instrument_run_fastq_to_ora_step this should be set to 16 / ora_threads_per_file.
ora_parallel_files: 2

# ora print file info (Optional)
# Default value: False
# Docs: Prints file information summary of ORA compressed files.
ora_print_file_info: false

# ora reference (Required)
# Docs: The reference tar to use for the ORA compression
ora_reference:
  class: File
  location: icav2://project_id/path/to/file

# ora threads per file (Optional)
# Default value: 8
# Docs: The number of threads to use per file. If using an FPGA medium instance in the 
# run_dragen_instrument_run_fastq_to_ora_step this should be set to 4 since there are only 16 cores available
ora_threads_per_file: 8

# sample id list (Optional)
# Docs: Optional list of samples to process.  
# Samples NOT in this list are NOT compressed AND NOT transferred to the final output directory!
sample_id_list:
- string

Json

{
    "instrument_run_directory": {
        "class": "Directory",
        "location": "icav2://project_id/path/to/dir/"
    },
    "ora_check_file_integrity": false,
    "ora_parallel_files": 2,
    "ora_print_file_info": false,
    "ora_reference": {
        "class": "File",
        "location": "icav2://project_id/path/to/file"
    },
    "ora_threads_per_file": 8,
    "sample_id_list": [
        "string"
    ]
}

Outputs Template

{
    "output_directory": {
        "class": "Directory",
        "location": "icav2://project_id/path/to/dir/"
    }
}

Overrides Template

Zipped workflow

[
    "workflow.cwl#dragen-instrument-run-fastq-to-ora-pipeline--4.2.4/run_dragen_instrument_run_fastq_to_ora_step"
]

Packed workflow

[
    "#main/run_dragen_instrument_run_fastq_to_ora_step"
]

Inputs

instrument run directory

ID: instrument_run_directory

Optional: False
Type: Directory
Docs:
The directory containing the instrument run. Expected to be in the BCLConvert 4.2.7 output format, with the following structure:
Reports/
InterOp/
Logs/
Samples/
Samples/Lane_1/
Samples/Lane_1/Sample_ID/
Samples/Lane_1/Sample_ID/Sample_ID_S1_L001_R1_001.fastq.gz
Samples/Lane_1/Sample_ID/Sample_ID_S1_L001_R2_001.fastq.gz
etc...

ora check file integrity

ID: ora_check_file_integrity

Optional: False
Type: boolean
Docs:
Set to true to perform and output result of FASTQ file and decompressed FASTQ.ORA integrity check. The default value is false.

ora parallel files

ID: ora_parallel_files

Optional: True
Type: int
Docs:
The number of files to compress in parallel. If using an FPGA medium instance in the
run_dragen_instrument_run_fastq_to_ora_step this should be set to 16 / ora_threads_per_file.

ora print file info

ID: ora_print_file_info

Optional: False
Type: boolean
Docs:
Prints file information summary of ORA compressed files.

ora reference

ID: ora_reference

Optional: False
Type: File
Docs:
The reference tar to use for the ORA compression

ora threads per file

ID: ora_threads_per_file

Optional: True
Type: int
Docs:
The number of threads to use per file. If using an FPGA medium instance in the
run_dragen_instrument_run_fastq_to_ora_step this should be set to 4 since there are only 16 cores available

sample id list

ID: sample_id_list

Optional: True
Type: .[]
Docs:
Optional list of samples to process.
Samples NOT in this list are NOT compressed AND NOT transferred to the final output directory!

Steps

Run Dragen Instrument Run Fastq to ORA

ID: dragen-instrument-run-fastq-to-ora-pipeline--4.2.4/run_dragen_instrument_run_fastq_to_ora_step

Step Type: tool
Docs:

Run the dragen instrument run fastq to ora tool

Outputs

output directory

ID: dragen-instrument-run-fastq-to-ora-pipeline--4.2.4/output_directory

Optional: False
Output Type: Directory
Docs:
The output directory of the instrument run with fastqs converted to oras

bclconvert-interop-qc/1.3.1--1.25.2__20241120234352

Overview

MD5Sum: bf3fba24832235c0203ea4f72c2a4d53

Documentation

Documentation for bclconvert-interop-qc v1.3.1--1.25.2
This workflow has been designed for BCLConvert 4.2.7 outputs from the Nextflow autolaunch pipeline.
The InterOp directory is expected to contain the IndexMetricsOut.bin file, otherwise the
index summary will not be generated.
It is assumed that the Reports directory will contain the RunInfo.xml file

Dockstore

Dockstore Version Link

ICAv2

Tenant: umccr-prod

Bundles Generated

Bundle Name: bclconvert_interop_qc_prod__1_3_1__1_25_2__20241120234352 / Bundle Version 1.3.1__1.25.2__20241120234352

Description
This bundle has been generated by the release of workflows/bclconvert-interop-qc/1.3.1--1.25.2/bclconvert-interop-qc__1.3.1--1.25.2.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/bclconvert-interop-qc/1.3.1--1.25.2__20241120234352.

Version Description
Bundle version description is currently redundant while we cannot append versions to bundles. Regardless, the bundle version is 1.3.1__1.25.2

Bundle ID: cbc8246b-325f-4d2d-b0cd-3a5e1167bc4a

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: dfbf508b-fa4d-47d5-8de5-da63a20c894b
    Pipeline Code: bclconvert-interop-qc__1_3_1--1_25_2__20241120234352

Projects

  • development
  • staging
  • production

Visual Overview

bclconvert-interop-qc

Inputs Template

Yaml

# yaml-language-server: $schema=https://github.com/umccr/cwl-ica/releases/download/bclconvert-interop-qc%2F1.3.1--1.25.2__20241120234352/bclconvert-interop-qc__1.3.1--1.25.2__20241120234352.schema.json

# BCLConvert Report Directory (Required)
# Docs: The output directory from a BCLConvert run named 'Reports'
bclconvert_report_directory:
  class: Directory
  location: icav2://project_id/path/to/dir/

# Instrument Run ID (Required)
# Docs: The instrument run ID
instrument_run_id: string

# Interop Directory (Required)
# Docs: The interop directory
interop_directory:
  class: Directory
  location: icav2://project_id/path/to/dir/

Json

{
    "bclconvert_report_directory": {
        "class": "Directory",
        "location": "icav2://project_id/path/to/dir/"
    },
    "instrument_run_id": "string",
    "interop_directory": {
        "class": "Directory",
        "location": "icav2://project_id/path/to/dir/"
    }
}
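
As a sketch of how the inputs above might be assembled for a single run, the helper below builds the same structure from one run directory URI. It assumes that `Reports/` and `InterOp/` sit side by side under that directory, as in the BCLConvert 4.2.7 output layout. The function name, the example URI, and the example run ID are placeholders and not values from this release.

```python
import json


def interop_qc_inputs(run_dir_uri: str, instrument_run_id: str) -> dict:
    """Build an inputs mapping for bclconvert-interop-qc.

    Assumption: the Reports/ and InterOp/ directories are direct children
    of run_dir_uri. Adjust the locations if your layout differs.
    """
    base = run_dir_uri.rstrip("/")
    return {
        "bclconvert_report_directory": {
            "class": "Directory",
            "location": f"{base}/Reports/",
        },
        "instrument_run_id": instrument_run_id,
        "interop_directory": {
            "class": "Directory",
            "location": f"{base}/InterOp/",
        },
    }


if __name__ == "__main__":
    # Placeholder URI and run ID, for illustration only.
    inputs = interop_qc_inputs("icav2://project_id/path/to/dir/", "RUN_ID")
    print(json.dumps(inputs, indent=4))
```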

Outputs Template

{
    "interop_output_dir": {
        "class": "Directory",
        "location": "icav2://project_id/path/to/dir/"
    },
    "multiqc_html_report": {
        "class": "File",
        "location": "icav2://project_id/path/to/file"
    },
    "multiqc_output_dir": {
        "class": "Directory",
        "location": "icav2://project_id/path/to/dir/"
    }
}

Overrides Template

Zipped workflow

[
    "workflow.cwl#bclconvert-interop-qc--1.3.1--1.25.2/generate_interop_qc_step",
    "workflow.cwl#bclconvert-interop-qc--1.3.1--1.25.2/run_multiqc_step"
]

Packed workflow

[
    "#main/generate_interop_qc_step",
    "#main/run_multiqc_step"
]

Inputs

BCLConvert Report Directory

ID: bclconvert_report_directory

Optional: False
Type: Directory
Docs:
The output directory from a BCLConvert run named 'Reports'

Instrument Run ID

ID: instrument_run_id

Optional: False
Type: string
Docs:
The instrument run ID

Interop Directory

ID: interop_directory

Optional: False
Type: Directory
Docs:
The interop directory

Steps

Generate InterOp QC

ID: bclconvert-interop-qc--1.3.1--1.25.2/generate_interop_qc_step

Step Type: tool
Docs:

Generate the interop files by mounting the interop directory, along with the RunInfo.xml file,
underneath a directory named after the specified run ID.

Get RunInfo.xml file from Reports Dir

ID: bclconvert-interop-qc--1.3.1--1.25.2/get_run_info_xml_file_from_reports_dir

Step Type: expression
Docs:

Get the RunInfo.xml file from the Reports Directory

Run Multiqc

ID: bclconvert-interop-qc--1.3.1--1.25.2/run_multiqc_step

Step Type: tool
Docs:

Run MultiQC on the input reports directory along with the generated index summary files

Outputs

interop out dir

ID: bclconvert-interop-qc--1.3.1--1.25.2/interop_output_dir

Optional: False
Output Type: Directory
Docs:
Directory containing the interop summary CSVs

multiqc html report

ID: bclconvert-interop-qc--1.3.1--1.25.2/multiqc_html_report

Optional: False
Output Type: File
Docs:
The HTML report generated by the multiqc step

multiqc output dir

ID: bclconvert-interop-qc--1.3.1--1.25.2/multiqc_output_dir

Optional: False
Output Type: Directory
Docs:
Directory containing the multiqc data

bclconvert-interop-qc/1.3.1--1.21__20241119001529

Overview

MD5Sum: dfad0c0195611d1ff0ca6a255955fc00

Documentation

Documentation for bclconvert-interop-qc v1.3.1--1.21
This workflow has been designed for BCLConvert 4.2.7 outputs from the Nextflow autolaunch pipeline.
The InterOp directory is expected to contain the IndexMetricsOut.bin file, otherwise the
index summary will not be generated.
It is assumed that the Reports directory will contain the RunInfo.xml file

Dockstore

Dockstore Version Link

ICAv2

Tenant: umccr-prod

Bundles Generated

Bundle Name: bclconvert_interop_qc_prod__1_3_1__1_21__20241119001529 / Bundle Version 1.3.1__1.21__20241119001529

Description
This bundle has been generated by the release of workflows/bclconvert-interop-qc/1.3.1--1.21/bclconvert-interop-qc__1.3.1--1.21.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/bclconvert-interop-qc/1.3.1--1.21__20241119001529.

Version Description
Bundle version description is currently redundant while we cannot append versions to bundles. Regardless, the bundle version is 1.3.1__1.21

Bundle ID: 91b751c9-c19c-46ad-bcf4-2f97439feee6

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: a147ad9f-af8f-409d-95b7-49018782ab4d
    Pipeline Code: bclconvert-interop-qc__1_3_1--1_21__20241119001529

Projects

  • development
  • staging
  • production

Visual Overview

bclconvert-interop-qc

Inputs Template

Yaml

# yaml-language-server: $schema=https://github.com/umccr/cwl-ica/releases/download/bclconvert-interop-qc%2F1.3.1--1.21__20241119001529/bclconvert-interop-qc__1.3.1--1.21__20241119001529.schema.json

# BCLConvert Report Directory (Required)
# Docs: The output directory from a BCLConvert run named 'Reports'
bclconvert_report_directory:
  class: Directory
  location: icav2://project_id/path/to/dir/

# Instrument Run ID (Required)
# Docs: The instrument run ID
instrument_run_id: string

# Interop Directory (Required)
# Docs: The interop directory
interop_directory:
  class: Directory
  location: icav2://project_id/path/to/dir/

Json

{
    "bclconvert_report_directory": {
        "class": "Directory",
        "location": "icav2://project_id/path/to/dir/"
    },
    "instrument_run_id": "string",
    "interop_directory": {
        "class": "Directory",
        "location": "icav2://project_id/path/to/dir/"
    }
}

Outputs Template

{
    "interop_output_dir": {
        "class": "Directory",
        "location": "icav2://project_id/path/to/dir/"
    },
    "multiqc_html_report": {
        "class": "File",
        "location": "icav2://project_id/path/to/file"
    },
    "multiqc_output_dir": {
        "class": "Directory",
        "location": "icav2://project_id/path/to/dir/"
    }
}

Overrides Template

Zipped workflow

[
    "workflow.cwl#bclconvert-interop-qc--1.3.1--1.21/generate_interop_qc_step",
    "workflow.cwl#bclconvert-interop-qc--1.3.1--1.21/run_multiqc_step"
]

Packed workflow

[
    "#main/generate_interop_qc_step",
    "#main/run_multiqc_step"
]

Inputs

BCLConvert Report Directory

ID: bclconvert_report_directory

Optional: False
Type: Directory
Docs:
The output directory from a BCLConvert run named 'Reports'

Instrument Run ID

ID: instrument_run_id

Optional: False
Type: string
Docs:
The instrument run ID

Interop Directory

ID: interop_directory

Optional: False
Type: Directory
Docs:
The interop directory

Steps

Generate InterOp QC

ID: bclconvert-interop-qc--1.3.1--1.21/generate_interop_qc_step

Step Type: tool
Docs:

Generate the interop files by mounting the interop directory, along with the RunInfo.xml file,
underneath a directory named after the specified run ID.

Get RunInfo.xml file from Reports Dir

ID: bclconvert-interop-qc--1.3.1--1.21/get_run_info_xml_file_from_reports_dir

Step Type: expression
Docs:

Get the RunInfo.xml file from the Reports Directory

Run Multiqc

ID: bclconvert-interop-qc--1.3.1--1.21/run_multiqc_step

Step Type: tool
Docs:

Run MultiQC on the input reports directory along with the generated index summary files

Outputs

interop out dir

ID: bclconvert-interop-qc--1.3.1--1.21/interop_output_dir

Optional: False
Output Type: Directory
Docs:
Directory containing the interop summary CSVs

multiqc html report

ID: bclconvert-interop-qc--1.3.1--1.21/multiqc_html_report

Optional: False
Output Type: File
Docs:
The HTML report generated by the multiqc step

multiqc output dir

ID: bclconvert-interop-qc--1.3.1--1.21/multiqc_output_dir

Optional: False
Output Type: Directory
Docs:
Directory containing the multiqc data

dragen-somatic-with-germline-pipeline/4.3.6__20241115073817

Overview

MD5Sum: 34be195a66bd929ae3ba18b0c7ec10a8

Documentation

Documentation for dragen-somatic-with-germline-pipeline v4.3.6

Dockstore

Dockstore Version Link

ICAv2

Tenant: umccr-prod

Bundles Generated

Bundle Name: dragen_somatic_with_germline_pipeline_with_validation_data__4_3_6__20241115073817 / Bundle Version v10_r4__20241115073817

Description
This bundle has been generated by the release of workflows/dragen-somatic-with-germline-pipeline/4.3.6/dragen-somatic-with-germline-pipeline__4.3.6.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/dragen-somatic-with-germline-pipeline/4.3.6__20241115073817.

Version Description
Bundle version description is currently redundant while we cannot append versions to bundles. Regardless, the bundle version is v10_r4

Bundle ID: aef12d8c-055e-4a5c-a949-7200d296e3aa

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: adeb223a-0a84-43bc-a13a-9bf0ed31e565
    Pipeline Code: dragen-somatic-with-germline-pipeline__4_3_6__20241115073817

Projects

  • development
  • staging

Datasets

  • dragen_hash_table_chm13_v2_v10_r4_graph_cnv_hla_rna
  • dragen_hash_table_chm13_v2_v10_r4_linear_cnv_hla_rna_methylated_combined
  • dragen_hash_table_hg38_alt_masked_v10_r4_graph_cnv_hla_rna
  • dragen_hash_table_hg38_alt_masked_v10_r4_linear_cnv_hla_rna_methylated_combined
  • wgs_validation_fastq__cups_pair_8
  • wgs_validation_fastq__2016_249_17_MH_P033
  • wgs_validation_fastq__2016_249_18_WH_P025
  • wgs_validation_fastq__B_ALL_Case_10
  • wgs_validation_fastq_Diploid_Never_Responder
  • wgs_validation_fastq_SBJ00303
  • wgs_validation_fastq_SEQC50
  • wgs_validation_fastq_SFRC01073

Bundle Name: dragen_somatic_with_germline_pipeline_prod__4_3_6__20241115073817 / Bundle Version v10_r4__20241115073817

Description
This bundle has been generated by the release of workflows/dragen-somatic-with-germline-pipeline/4.3.6/dragen-somatic-with-germline-pipeline__4.3.6.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/dragen-somatic-with-germline-pipeline/4.3.6__20241115073817.

Version Description
Bundle version description is currently redundant while we cannot append versions to bundles. Regardless, the bundle version is v10_r4

Bundle ID: dee75735-a97c-4572-b51e-04776a4fdc36

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: adeb223a-0a84-43bc-a13a-9bf0ed31e565
    Pipeline Code: dragen-somatic-with-germline-pipeline__4_3_6__20241115073817

Projects

  • production

Datasets

  • dragen_hash_table_chm13_v2_v10_r4_graph_cnv_hla_rna
  • dragen_hash_table_chm13_v2_v10_r4_linear_cnv_hla_rna_methylated_combined
  • dragen_hash_table_hg38_alt_masked_v10_r4_graph_cnv_hla_rna
  • dragen_hash_table_hg38_alt_masked_v10_r4_linear_cnv_hla_rna_methylated_combined

Visual Overview

dragen-somatic-with-germline-pipeline

Inputs Template

Yaml

# yaml-language-server: $schema=https://github.com/umccr/cwl-ica/releases/download/dragen-somatic-with-germline-pipeline%2F4.3.6__20241115073817/dragen-somatic-with-germline-pipeline__4.3.6__20241115073817.schema.json

# bam input (Optional)
# Docs: Input a normal BAM file for the variant calling stage
bam_input:
  class: File
  location: icav2://project_id/path/to/file

# cnv enable self normalization (Optional)
# Docs: Enable CNV self normalization.
# Self Normalization requires that the DRAGEN hash table be generated with the enable-cnv=true option.
cnv_enable_self_normalization: false

# cnv normal b allele vcf (Optional)
# Docs: Specify a matched normal SNV VCF.
cnv_normal_b_allele_vcf:
  class: File
  location: icav2://project_id/path/to/file

# cnv normal cnv vcf (Optional)
# Docs: Specify germline CNVs from the matched normal sample.
cnv_normal_cnv_vcf: false

# cnv population b allele vcf (Optional)
# Docs: Specify a population SNP catalog.
cnv_population_b_allele_vcf:
  class: File
  location: icav2://project_id/path/to/file

# cnv somatic enable het calling (Optional)
# Docs: Enable HET-calling mode for heterogeneous segments.
cnv_somatic_enable_het_calling: false

# cnv somatic enable lower ploidy limit (Optional)
# Docs: To improve accuracy on the tumor ploidy model estimation, the somatic WGS CNV caller estimates whether the chosen model calls
# homozygous deletions on regions that are likely to reduce the overall fitness of cells,
# which are therefore deemed to be "essential" and under negative selection.
# In the current literature, recent efforts tried to map such cell-essential genes (eg, in 2015 - https://www.science.org/doi/10.1126/science.aac7041).
# The check on essential regions is controlled with --cnv-somatic-enable-lower-ploidy-limit (default true).
cnv_somatic_enable_lower_ploidy_limit: false

# cnv somatic essential genes bed (Optional)
# Docs: Default bedfiles describing the essential regions are provided for hg19, GRCh37, hs37d5, GRCh38,
# but a custom bedfile can also be provided in input through the
# --cnv-somatic-essential-genes-bed=<BEDFILE_PATH> parameter.
# In such case, the feature is automatically enabled.
# A custom essential regions bedfile needs to have the following format: 4-column, tab-separated,
# where the first 3 columns identify the coordinates of the essential region (chromosome, 0-based start, excluded end).
# The fourth column is the region id (string type). For the purpose of the algorithm, currently only the first 3 columns are used.
# However, the fourth might be helpful to investigate manually which regions drove the decisions on model plausibility made by the caller.
cnv_somatic_essential_genes_bed: string

# cnv use somatic vc baf (Optional)
# Docs: If running in tumor-normal mode with the SNV caller enabled, use this option
# to specify the germline heterozygous sites.
cnv_use_somatic_vc_baf: false

# cnv use somatic vc vaf (Optional)
# Docs: Use the variant allele frequencies (VAFs) from the somatic SNVs to help select
# the tumor model for the sample.
cnv_use_somatic_vc_vaf: false

# cram input (Optional)
# Docs: Input a normal CRAM file for the variant calling stage
cram_input:
  class: File
  location: icav2://project_id/path/to/file

# cram reference (Optional)
# Docs: Path to the reference fasta file for the CRAM input.
# Required only if the input is a CRAM file and its reference is not the one in the reference tarball
cram_reference:
  class: File
  location: icav2://project_id/path/to/file

# dbsnp annotation (Optional)
# Docs: In Germline, Tumor-Normal somatic, or Tumor-Only somatic modes,
# DRAGEN can look up variant calls in a dbSNP database and add annotations for any matches that it finds there.
# To enable the dbSNP database search, set the --dbsnp option to the full path to the dbSNP database
# VCF or .vcf.gz file, which must be sorted in reference order.
dbsnp_annotation:
  class: File
  location: icav2://project_id/path/to/file

# deduplicate minimum quality (Optional)
# Docs: Specifies the Phred quality score below which a base should be excluded from the quality score
# calculation used for choosing among duplicate reads.
dedup_min_qual: string

# deduplicate minimum quality germline (Optional)
# Docs: Specifies the Phred quality score below which a base should be excluded from the quality score
# calculation used for choosing among duplicate reads.
dedup_min_qual_germline: string

# deduplicate minimum quality somatic (Optional)
# Docs: Specifies the Phred quality score below which a base should be excluded from the quality score
# calculation used for choosing among duplicate reads.
dedup_min_qual_somatic: string

# enable cnv calling (Optional)
# Docs: Enable CNV processing in the DRAGEN Host Software.
enable_cnv: false

# enable cnv germline (Optional)
# Docs: Enable CNV processing in the DRAGEN Host Software (germline only)
enable_cnv_germline: false

# enable cnv somatic (Optional)
# Docs: Enable CNV processing in the DRAGEN Host Software (somatic only)
enable_cnv_somatic: false

# enable duplicate marking (Optional)
# Docs: Enable the flagging of duplicate output
# alignment records.
enable_duplicate_marking: false

# enable duplicate marking germline (Optional)
# Docs: Enable the flagging of duplicate output
# alignment records.
enable_duplicate_marking_germline: false

# enable duplicate marking somatic (Optional)
# Docs: Enable the flagging of duplicate output
# alignment records.
enable_duplicate_marking_somatic: false

# enable hla (Optional)
# Docs: Enable HLA typing by setting --enable-hla flag to true
enable_hla: false

# enable hrd (Optional)
# Docs: Set to true to enable HRD scoring to quantify genomic instability.
# Requires somatic CNV calls.
enable_hrd: false

# enable map align (Optional)
# Docs: Enabled by default since --enable-variant-caller option is set to true.
# Set this value to false if using bam_input
enable_map_align: false

# enable map align germline (Optional)
# Docs: Enabled by default since --enable-variant-caller option is set to true.
# Set this value to false if using bam_input
enable_map_align_germline: false

# enable map align output (Optional)
# Docs: Enables saving the output from the
# map/align stage....
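The custom essential-regions BED format described under cnv_somatic_essential_genes_bed above (4 columns, tab-separated, 0-based start, excluded end, region id in column four) can be checked before a run. The following is only a minimal sketch: the function name `parse_essential_regions` is hypothetical and not part of the pipeline or of DRAGEN, and it assumes the file has no header or comment lines, which the template text does not specify.

```python
from typing import List, Tuple

Region = Tuple[str, int, int, str]


def parse_essential_regions(text: str) -> List[Region]:
    """Parse BED text into (chromosome, start, end, region_id) tuples.

    Hypothetical helper: raises ValueError on any line that does not have
    exactly four tab-separated columns or has invalid coordinates.
    """
    regions: List[Region] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines (an assumption, not stated in the docs)
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError(
                f"line {line_no}: expected 4 tab-separated columns, got {len(fields)}"
            )
        chrom, start_text, end_text, region_id = fields
        start, end = int(start_text), int(end_text)  # int() raises ValueError on bad input
        # 0-based start with an excluded end means a non-empty region needs end > start.
        if start < 0 or end <= start:
            raise ValueError(f"line {line_no}: invalid interval {start}-{end}")
        regions.append((chrom, start, end, region_id))
    return regions
```

According to the description, only the first three columns are used by the caller; the sketch keeps the fourth so that regions can still be traced by id when inspecting results manually.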

dragen-somatic-with-germline-pipeline/4.3.6__20241115045341

Overview

MD5Sum: 76caa14272dd68d3a994738b73dcb7d7

Documentation

Documentation for dragen-somatic-with-germline-pipeline v4.3.6

Dockstore

Dockstore Version Link

ICAv2

Tenant: umccr-prod

Bundles Generated

Bundle Name: dragen_somatic_with_germline_pipeline_with_validation_data__4_3_6__20241115045341 / Bundle Version v10_r4__20241115045341

Description
This bundle has been generated by the release of workflows/dragen-somatic-with-germline-pipeline/4.3.6/dragen-somatic-with-germline-pipeline__4.3.6.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/dragen-somatic-with-germline-pipeline/4.3.6__20241115045341.

Version Description
Bundle version description is currently redundant while we cannot append versions to bundles. Regardless - the bundle version is v10_r4

Bundle ID: 8a354985-78c6-4fc8-90ed-00b92dde5091

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: 3c5ea2c7-dedf-4f3e-92ee-ca66a619ad39
    Pipeline Code: dragen-somatic-with-germline-pipeline__4_3_6__20241115045341

Projects

  • development
  • staging

Datasets

  • dragen_hash_table_chm13_v2_v10_r4_graph_cnv_hla_rna
  • dragen_hash_table_chm13_v2_v10_r4_linear_cnv_hla_rna_methylated_combined
  • dragen_hash_table_hg38_alt_masked_v10_r4_graph_cnv_hla_rna
  • dragen_hash_table_hg38_alt_masked_v10_r4_linear_cnv_hla_rna_methylated_combined
  • wgs_validation_fastq__cups_pair_8
  • wgs_validation_fastq__2016_249_17_MH_P033
  • wgs_validation_fastq__2016_249_18_WH_P025
  • wgs_validation_fastq__B_ALL_Case_10
  • wgs_validation_fastq_Diploid_Never_Responder
  • wgs_validation_fastq_SBJ00303
  • wgs_validation_fastq_SEQC50
  • wgs_validation_fastq_SFRC01073

Bundle Name: dragen_somatic_with_germline_pipeline_prod__4_3_6__20241115045341 / Bundle Version v10_r4__20241115045341

Description
This bundle has been generated by the release of workflows/dragen-somatic-with-germline-pipeline/4.3.6/dragen-somatic-with-germline-pipeline__4.3.6.cwl. The pipeline can be found at https://github.com/umccr/cwl-ica/releases/tag/dragen-somatic-with-germline-pipeline/4.3.6__20241115045341.

Version Description
Bundle version description is currently redundant while we cannot append versions to bundles. Regardless - the bundle version is v10_r4

Bundle ID: ceb34a28-344c-4fd7-808b-881468e91ded

  • Bundle Link
    Pipeline Project ID: 5844391a-69db-4b52-86b5-6a0d55c2386f
    Pipeline Project Name: pipelines
    Pipeline ID: 3c5ea2c7-dedf-4f3e-92ee-ca66a619ad39
    Pipeline Code: dragen-somatic-with-germline-pipeline__4_3_6__20241115045341

Projects

  • production

Datasets

  • dragen_hash_table_chm13_v2_v10_r4_graph_cnv_hla_rna
  • dragen_hash_table_chm13_v2_v10_r4_linear_cnv_hla_rna_methylated_combined
  • dragen_hash_table_hg38_alt_masked_v10_r4_graph_cnv_hla_rna
  • dragen_hash_table_hg38_alt_masked_v10_r4_linear_cnv_hla_rna_methylated_combined

Visual Overview

dragen-somatic-with-germline-pipeline

Inputs Template

Yaml

# yaml-language-server: $schema=https://github.com/umccr/cwl-ica/releases/download/dragen-somatic-with-germline-pipeline%2F4.3.6__20241115045341/dragen-somatic-with-germline-pipeline__4.3.6__20241115045341.schema.json

# bam input (Optional)
# Docs: Input a normal BAM file for the variant calling stage
bam_input:
  class: File
  location: icav2://project_id/path/to/file

# cnv enable self normalization (Optional)
# Docs: Enable CNV self normalization.
# Self Normalization requires that the DRAGEN hash table be generated with the enable-cnv=true option.
cnv_enable_self_normalization: false

# cnv normal b allele vcf (Optional)
# Docs: Specify a matched normal SNV VCF.
cnv_normal_b_allele_vcf:
  class: File
  location: icav2://project_id/path/to/file

# cnv normal cnv vcf (Optional)
# Docs: Specify germline CNVs from the matched normal sample.
cnv_normal_cnv_vcf: false

# cnv population b allele vcf (Optional)
# Docs: Specify a population SNP catalog.
cnv_population_b_allele_vcf:
  class: File
  location: icav2://project_id/path/to/file

# cnv somatic enable het calling (Optional)
# Docs: Enable HET-calling mode for heterogeneous segments.
cnv_somatic_enable_het_calling: false

# cnv somatic enable lower ploidy limit (Optional)
# Docs: To improve accuracy on the tumor ploidy model estimation, the somatic WGS CNV caller estimates whether the chosen model calls
# homozygous deletions on regions that are likely to reduce the overall fitness of cells,
# which are therefore deemed to be "essential" and under negative selection.
# In the current literature, recent efforts tried to map such cell-essential genes (eg, in 2015 - https://www.science.org/doi/10.1126/science.aac7041).
# The check on essential regions is controlled with --cnv-somatic-enable-lower-ploidy-limit (default true).
cnv_somatic_enable_lower_ploidy_limit: false

# cnv somatic essential genes bed (Optional)
# Docs: Default bedfiles describing the essential regions are provided for hg19, GRCh37, hs37d5, GRCh38,
# but a custom bedfile can also be provided in input through the
# --cnv-somatic-essential-genes-bed=<BEDFILE_PATH> parameter.
# In such case, the feature is automatically enabled.
# A custom essential regions bedfile needs to have the following format: 4-column, tab-separated,
# where the first 3 columns identify the coordinates of the essential region (chromosome, 0-based start, excluded end).
# The fourth column is the region id (string type). For the purpose of the algorithm, currently only the first 3 columns are used.
# However, the fourth might be helpful to investigate manually which regions drove the decisions on model plausibility made by the caller.
cnv_somatic_essential_genes_bed: string

# cnv use somatic vc baf (Optional)
# Docs: If running in tumor-normal mode with the SNV caller enabled, use this option
# to specify the germline heterozygous sites.
cnv_use_somatic_vc_baf: false

# cnv use somatic vc vaf (Optional)
# Docs: Use the variant allele frequencies (VAFs) from the somatic SNVs to help select
# the tumor model for the sample.
cnv_use_somatic_vc_vaf: false

# cram input (Optional)
# Docs: Input a normal CRAM file for the variant calling stage
cram_input:
  class: File
  location: icav2://project_id/path/to/file

# cram reference (Optional)
# Docs: Path to the reference fasta file for the CRAM input.
# Required only if the input is a CRAM file and its reference is not the one in the reference tarball
cram_reference:
  class: File
  location: icav2://project_id/path/to/file

# dbsnp annotation (Optional)
# Docs: In Germline, Tumor-Normal somatic, or Tumor-Only somatic modes,
# DRAGEN can look up variant calls in a dbSNP database and add annotations for any matches that it finds there.
# To enable the dbSNP database search, set the --dbsnp option to the full path to the dbSNP database
# VCF or .vcf.gz file, which must be sorted in reference order.
dbsnp_annotation:
  class: File
  location: icav2://project_id/path/to/file

# deduplicate minimum quality (Optional)
# Docs: Specifies the Phred quality score below which a base should be excluded from the quality score
# calculation used for choosing among duplicate reads.
dedup_min_qual: string

# deduplicate minimum quality germline (Optional)
# Docs: Specifies the Phred quality score below which a base should be excluded from the quality score
# calculation used for choosing among duplicate reads.
dedup_min_qual_germline: string

# deduplicate minimum quality somatic (Optional)
# Docs: Specifies the Phred quality score below which a base should be excluded from the quality score
# calculation used for choosing among duplicate reads.
dedup_min_qual_somatic: string

# enable cnv calling (Optional)
# Docs: Enable CNV processing in the DRAGEN Host Software.
enable_cnv: false

# enable cnv germline (Optional)
# Docs: Enable CNV processing in the DRAGEN Host Software (germline only)
enable_cnv_germline: false

# enable cnv somatic (Optional)
# Docs: Enable CNV processing in the DRAGEN Host Software (somatic only)
enable_cnv_somatic: false

# enable duplicate marking (Optional)
# Docs: Enable the flagging of duplicate output
# alignment records.
enable_duplicate_marking: false

# enable duplicate marking germline (Optional)
# Docs: Enable the flagging of duplicate output
# alignment records.
enable_duplicate_marking_germline: false

# enable duplicate marking somatic (Optional)
# Docs: Enable the flagging of duplicate output
# alignment records.
enable_duplicate_marking_somatic: false

# enable hla (Optional)
# Docs: Enable HLA typing by setting --enable-hla flag to true
enable_hla: false

# enable hrd (Optional)
# Docs: Set to true to enable HRD scoring to quantify genomic instability.
# Requires somatic CNV calls.
enable_hrd: false

# enable map align (Optional)
# Docs: Enabled by default since --enable-variant-caller option is set to true.
# Set this value to false if using bam_input
enable_map_align: false

# enable map align germline (Optional)
# Docs: Enabled by default since --enable-variant-caller option is set to true.
# Set this value to false if using bam_input
enable_map_align_germline: false

# enable map align output (Optional)
# Docs: Enables saving the output from the
# map/align stage....
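The template notes that enable_map_align and enable_map_align_germline are enabled by default and should be set to false when bam_input is used. A small pre-submission check along those lines might look like the sketch below. It assumes the inputs YAML has already been loaded into a plain dictionary; the function name `check_map_align_settings` is hypothetical, and whether the pipeline itself rejects or tolerates such a combination is not stated in the template.

```python
from typing import Any, Dict, List


def check_map_align_settings(inputs: Dict[str, Any]) -> List[str]:
    """Return warnings for map/align settings that conflict with bam_input.

    Hypothetical helper: a missing key is treated as enabled, because the
    template describes both options as enabled by default.
    """
    warnings: List[str] = []
    if "bam_input" not in inputs:
        return warnings
    for key in ("enable_map_align", "enable_map_align_germline"):
        if inputs.get(key, True) is not False:
            warnings.append(f"{key} should be false when bam_input is set")
    return warnings


# Example with placeholder values taken from the template.
example_inputs = {
    "bam_input": {"class": "File", "location": "icav2://project_id/path/to/file"},
    "enable_map_align": False,
}
for message in check_map_align_settings(example_inputs):
    print(message)
```

Returning a list of messages rather than raising keeps the check advisory, which suits a template where every input shown is optional.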