Home
The Data Flow SOP outlines the process of moving from sequencing to making data available in Turbo. This involves organizing data in the Raw_sequencing_folder.
The repository structure is as follows:
Data-Flow-SOP
├── create_directories.py
├── create_higher_level_dirs.py
├── illumina
│ ├── move_files_to_directories_illumina.py
│ └── rename_samples_illumina.sh
├── nanopore
│ ├── move_files_to_dirs_nanopore.py
│ └── rename_samples_nanopore.sh
├── pics
│ ├── globus-raw-fastq-nanopore.png
│ ├── globus_raw_fastq.png
│ ├── transfer-nanoQC-results.png
│ └── transfer-qcd-results.png
├── processing-hybrid-samples.md
├── processing-illumina-samples.md
├── processing-nanopore-samples.md
└── README.md
- create_directories.py: A Python script that creates a Project folder for organizing Illumina and Nanopore data (see New Sequence Data Structure).
- create_higher_level_dirs.py: A Python script that sets up a new Sequence_data directory structure (see New Sequence Data Structure).
- illumina/move_files_to_directories_illumina.py: Moves Illumina samples that pass QC metrics based on the QCD pipeline.
- illumina/rename_samples_illumina.sh: A Bash script to rename Illumina samples according to predefined rules.
- nanopore/move_files_to_dirs_nanopore.py: Moves Nanopore samples that pass QC metrics based on the nanoQC pipeline.
- nanopore/rename_samples_nanopore.sh: A Bash script to rename long-read Nanopore samples based on specific criteria.
- processing-hybrid-samples.md: Contains instructions for processing hybrid samples.
- processing-illumina-samples.md: Guides users through processing Illumina (short-read) data.
- processing-nanopore-samples.md: Contains steps for processing Nanopore (long-read) data.
- README.md: Provides an overview of the repository and instructions for usage.
- illumina/: Contains scripts for moving and renaming Illumina samples.
- nanopore/: Includes scripts for organizing and renaming Nanopore samples.
- pics/: Stores screenshots used in documentation for processing Illumina and Nanopore data.
Important: If your project directory already contains a Sequence_data folder, rename it to old_Sequence_data before proceeding.
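The rename described above can be done by hand, but a minimal Python sketch of the same step is shown here for illustration. The function name archive_existing_sequence_data is hypothetical and is not part of this repository; the refusal to overwrite an existing old_Sequence_data folder is an assumption about what is safe, not documented behavior.

```python
from pathlib import Path


def archive_existing_sequence_data(project_dir):
    """Rename <project_dir>/Sequence_data to old_Sequence_data if it exists.

    Hypothetical helper for illustration; not one of the repository's scripts.
    Returns the path of the renamed folder, or None if there was nothing to rename.
    """
    project = Path(project_dir)
    current = project / "Sequence_data"
    archived = project / "old_Sequence_data"

    # Nothing to do when the project has no Sequence_data folder yet.
    if not current.is_dir():
        return None

    # Assumption: never overwrite an earlier backup silently.
    if archived.exists():
        raise FileExistsError(f"{archived} already exists; resolve it manually first")

    current.rename(archived)
    return archived
```

Calling archive_existing_sequence_data("/path/to/Project") before running the directory-creation scripts would leave the previous data untouched under old_Sequence_data.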
The new Sequence_data directory structure created by the create_higher_level_dirs.py script is organized as follows:
/Users/Dhatrib/Desktop/Project_Test/Sequence_data/
├── assembly
│ └── illumina
├── illumina_fastq
├── metadata
│ ├── AGC_submission
│ ├── plasmidsaurus
│ └── sample_lookup
└── variant_calling
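For readers who want to see how a layout like the one above can be produced, here is a minimal sketch. It is not the source of create_higher_level_dirs.py; the function name create_sequence_data_skeleton and the HIGHER_LEVEL_DIRS list are hypothetical, and only the folder names are taken from the tree above.

```python
from pathlib import Path

# Folder names copied from the tree above, relative to Sequence_data.
HIGHER_LEVEL_DIRS = [
    "assembly/illumina",
    "illumina_fastq",
    "metadata/AGC_submission",
    "metadata/plasmidsaurus",
    "metadata/sample_lookup",
    "variant_calling",
]


def create_sequence_data_skeleton(project_dir):
    """Create the Sequence_data skeleton under project_dir and return its path.

    Hypothetical illustration; the real script may take different arguments.
    """
    root = Path(project_dir) / "Sequence_data"
    for rel in HIGHER_LEVEL_DIRS:
        # parents=True builds intermediate folders such as assembly/ and metadata/;
        # exist_ok=True makes the call safe to repeat.
        (root / rel).mkdir(parents=True, exist_ok=True)
    return root
```

Using exist_ok=True keeps the sketch idempotent, so running it twice on the same project does not raise an error.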
The create_directories.py script then adds further subdirectories under illumina_fastq, as shown below:
/Users/Dhatrib/Desktop/Project_Test/Sequence_data/
├── assembly
│ └── illumina
├── illumina_fastq
│ ├── 2025-01-24_Plate1-to-Plate3
│ │ ├── failed_qc_samples
│ │ ├── neg_ctrl
│ │ ├── passed_qc_samples
│ │ └── raw_fastq
│ └── clean_fastq_qc_pass_samples
├── metadata
│ ├── AGC_submission
│ ├── plasmidsaurus
│ └── sample_lookup
└── variant_calling
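The additional folders in the tree above could be created along the following lines. This is a sketch under stated assumptions, not the contents of create_directories.py: the function name create_run_dirs and its parameters are hypothetical, and the run name is simply passed in as a string, as in the 2025-01-24_Plate1-to-Plate3 example.

```python
from pathlib import Path

# Per-run folder names copied from the tree above.
RUN_SUBDIRS = ["failed_qc_samples", "neg_ctrl", "passed_qc_samples", "raw_fastq"]


def create_run_dirs(sequence_data_dir, run_name):
    """Create the per-run folders under illumina_fastq and return the run folder.

    Hypothetical illustration; the real script may derive run_name differently.
    """
    illumina = Path(sequence_data_dir) / "illumina_fastq"
    run_dir = illumina / run_name

    for sub in RUN_SUBDIRS:
        (run_dir / sub).mkdir(parents=True, exist_ok=True)

    # Sibling of the run folders, as it appears in the tree above.
    (illumina / "clean_fastq_qc_pass_samples").mkdir(parents=True, exist_ok=True)
    return run_dir
```

For example, create_run_dirs("/path/to/Project/Sequence_data", "2025-01-24_Plate1-to-Plate3") would produce the illumina_fastq portion of the layout shown above.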