Sydney-Informatics-Hub/ONT-bacpac-nf

ONT-bacpac-nf

🔧 THIS PIPELINE IS UNDER ACTIVE DEVELOPMENT 🔧

WIP title, WIP workflow.

Workflow description

A rapid and portable workflow for pond-side sequencing of bacterial pathogens for sustainable aquaculture using ONT long-read sequencing.

User guide

For Gadi-specific documentation, see docs/gadi-execution.md.

Developer notes

Fast, iterative testing is best done within an interactive session on Gadi. Start an interactive session with the following command:

qsub -I -P <PROJECT> -lwalltime=2:00:00 -lmem=190GB -lncpus=24 -qnormal -lstorage=scratch/<PROJECT>

Once the session starts, change back into your ONT-bacpac-nf directory, then run the test pipeline with:

bash test/run_test.sh

Keep in mind:

  • Job queues have no external network access, except copyq
  • Downloading the kraken2 database is currently the slowest step, so it's best to download it once and reuse it by passing the --kraken2_db parameter to the pipeline
  • A faster and more secure download method for all reference datasets, using aria2, will be explored
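
The reuse pattern in the second point could be wired up roughly as follows. This is a minimal sketch, not the pipeline's actual code: only the --kraken2_db parameter comes from this README, while the process name download_kraken2_db, its placeholder script, and the channel name kraken2_db_ch are assumptions made for illustration.

```nextflow
// Sketch only: everything except params.kraken2_db is hypothetical.
params.kraken2_db = null

// Placeholder for the real download step (which needs network access,
// so on Gadi it would have to run on copyq).
process download_kraken2_db {
  output:
  path("kraken2_db"), emit: db

  script:
  """
  # Placeholder: the real process would fetch and unpack the database here
  mkdir -p kraken2_db
  """
}

workflow {
  if (params.kraken2_db) {
    // Reuse an existing copy; checkIfExists makes the run fail early on a bad path
    log.info "Using existing kraken2 database: ${params.kraken2_db}"
    kraken2_db_ch = Channel.fromPath(params.kraken2_db, type: 'dir', checkIfExists: true)
  } else {
    log.info "No --kraken2_db given, downloading the database (slow)"
    download_kraken2_db()
    kraken2_db_ch = download_kraken2_db.out.db
  }

  kraken2_db_ch.view { db -> "kraken2 database: ${db}" }
}
```

With a layout like this, a second run would pass --kraken2_db followed by the path of the directory downloaded the first time, and the download branch would be skipped.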

Ensure you do the following:

  • Break down tasks into modules, with distinct functions
  • Clearly define input and output channels with descriptive names
  • Add comments within your code to explain the logic and purpose of complex sections
  • Use configuration files to manage parameters, separating code from configuration
  • Provide sensible default values
  • Design modules to fail early if prerequisites are not met or if an error occurs
  • Implement comprehensive Groovy logging with log.info
  • Specify resource requirements for each process
  • Use dynamic resource handling to adjust resource requests based on input data size, where possible
  • Use Singularity to execute biocontainers or Wave containers
  • Exploit parallelism by designing processes to run concurrently wherever possible
  • Consult the benchmarking results to optimise process resource requirements
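
Several of these points (configuration kept separate from code, sensible defaults, per-process resources, Singularity) usually live in nextflow.config. The fragment below is a minimal sketch with placeholder values throughout: the outdir parameter, the resource numbers, the exit codes treated as retryable, and the process_name selector are illustrative, not the pipeline's actual settings. Note that it scales requests by retry attempt, which is a simpler stand-in for scaling by input data size.

```nextflow
// nextflow.config (sketch; all values are placeholders)

params {
  outdir     = 'results'   // hypothetical parameter with a default value
  kraken2_db = null        // set on the command line to reuse a downloaded database
}

singularity {
  enabled    = true
  autoMounts = true
}

process {
  // Defaults applied to every process
  cpus   = 1
  memory = 4.GB
  time   = 1.h

  // Retry only on exit codes assumed here to mean "killed for resources";
  // anything else stops the run straight away
  errorStrategy = { task.exitStatus in [137, 140] ? 'retry' : 'terminate' }
  maxRetries    = 2

  // Per-process override; requests grow with each retry attempt
  withName: 'process_name' {
    cpus   = 4
    memory = { 8.GB * task.attempt }
    time   = { 2.h * task.attempt }
  }
}
```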

Please use this structure for modules, and save each module file as run_process.nf:

process process_name {
  tag "ADD A TAG THAT CAPTURES TASK LEVEL INFO"
  container '<link to container>'

  input:
  tuple val(barcode), path(<input>)

  output:
  path("*"), emit: process_out

  script:
  """
  # EXPLAIN THE PROCESS
  ADD CODE
  """
  """
}
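
As one possible filling-in of the template, the hypothetical module below counts the reads in a gzipped FASTQ file. The process name, the file names, the params.reads parameter, and the 1 GB size threshold are all assumptions chosen for illustration. The container line is left out so that the sketch runs with plain shell tools.

```nextflow
// Hypothetical example module following the template above.
params.reads = null   // assumed parameter: path to one gzipped FASTQ file

process count_reads {
  tag "${barcode}"
  cpus 1
  // Dynamic resources: request more memory when the input file is larger than ~1 GB
  memory { reads.size() > 1_000_000_000 ? 8.GB : 2.GB }

  input:
  tuple val(barcode), path(reads)

  output:
  tuple val(barcode), path("${barcode}.read_count.txt"), emit: read_count

  script:
  """
  # A FASTQ record spans 4 lines, so reads = lines / 4
  lines=\$(gzip -dc ${reads} | wc -l)
  echo \$(( lines / 4 )) > ${barcode}.read_count.txt
  """
}

workflow {
  // Fail early if the required input is missing
  if (!params.reads) {
    error "Please provide --reads <file.fastq.gz>"
  }

  // Use the file's simple name (e.g. barcode01) as the barcode label
  reads_ch = Channel
    .fromPath(params.reads, checkIfExists: true)
    .map { fq -> tuple(fq.simpleName, fq) }

  log.info "Counting reads in: ${params.reads}"
  count_reads(reads_ch)
  count_reads.out.read_count.view { barcode, counts -> "${barcode}: ${counts}" }
}
```

The named output (read_count) and the barcode carried through the tuple keep downstream channels easy to join by sample.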

Component tools

Additional notes

Help / FAQ / Troubleshooting

License(s)

Acknowledgements/citations/credits

About

Bacterial profiling workflow for ONT data, written in Nextflow.
