Commit
Rina Ahmed-Begrich committed Jul 22, 2024
0 parents, commit b4d5519
Showing 39 changed files with 1,690 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
# EditorConfig is awesome: http://EditorConfig.org | ||
|
||
# top-most EditorConfig file | ||
root = true | ||
|
||
[*] | ||
end_of_line = lf | ||
insert_final_newline = true | ||
charset = utf-8 | ||
indent_style = space | ||
indent_size = 4 | ||
|
||
[*.{yml,yaml}] | ||
indent_style = space | ||
indent_size = 2 |
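The `[*]` rules in the `.editorconfig` above (LF line endings, final newline, UTF-8, space indentation) can be checked programmatically. The following is a minimal sketch, not part of the commit: the function name `editorconfig_violations` is hypothetical, and it deliberately ignores `indent_size`, which cannot be verified reliably from raw text alone.

```python
from typing import List


def editorconfig_violations(data: bytes) -> List[str]:
    """Return rule violations for one file's raw bytes (hypothetical helper)."""
    try:
        text = data.decode("utf-8")  # charset = utf-8
    except UnicodeDecodeError:
        return ["not valid UTF-8"]

    problems = []
    if "\r" in text:  # end_of_line = lf
        problems.append("contains CR characters (expected LF line endings)")
    if text and not text.endswith("\n"):  # insert_final_newline = true
        problems.append("missing final newline")

    # indent_style = space: flag any tab inside the leading whitespace
    for number, line in enumerate(text.split("\n"), start=1):
        indentation = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indentation:
            problems.append(f"line {number}: tab in indentation")
    return problems


if __name__ == "__main__":
    print(editorconfig_violations(b"rule all:\n    input: 'x'\n"))
```

Editors with EditorConfig support apply these rules on save; a check like this is only useful as a guard in scripts or CI.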
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
*.smk linguist-language=Python | ||
Snakefile linguist-language=Python |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
name: PR | ||
on: | ||
pull_request_target: | ||
types: | ||
- opened | ||
- reopened | ||
- edited | ||
- synchronize | ||
|
||
jobs: | ||
title-format: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: amannn/[email protected] | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
with: | ||
validateSingleCommit: true |
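The `title-format` job above validates pull request titles. Assuming it enforces Conventional Commits style titles (an assumption, since the diff only shows the job name and `validateSingleCommit`), a title can be pre-checked locally with a sketch like the one below. The list of types and the regular expression are illustrative and may differ from what the action actually accepts.

```python
import re

# Commonly used Conventional Commits types; this list is an assumption.
TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore", "revert",
)

# type, optional (scope), optional "!" for breaking changes, then ": subject"
TITLE_RE = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"
    r"(?:\((?P<scope>[^()]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<subject>\S.*)$"
)


def is_semantic_title(title: str) -> bool:
    """Return True if the title looks like a Conventional Commits header."""
    return TITLE_RE.match(title) is not None


if __name__ == "__main__":
    for title in ("feat: add UMI extraction rule", "Added some stuff"):
        print(title, "->", is_semantic_title(title))
```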
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
name: Tests | ||
|
||
on: | ||
push: | ||
branches: [ main ] | ||
pull_request: | ||
branches: [ main ] | ||
|
||
|
||
jobs: | ||
Formatting: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Formatting | ||
uses: github/super-linter@v4 | ||
env: | ||
VALIDATE_ALL_CODEBASE: false | ||
DEFAULT_BRANCH: main | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
VALIDATE_SNAKEMAKE_SNAKEFMT: true | ||
|
||
Linting: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Lint workflow | ||
uses: snakemake/[email protected] | ||
with: | ||
directory: . | ||
snakefile: workflow/Snakefile | ||
args: "--lint" | ||
|
||
Testing: | ||
runs-on: ubuntu-latest | ||
needs: | ||
- Linting | ||
- Formatting | ||
steps: | ||
- uses: actions/checkout@v2 | ||
|
||
- name: Test workflow | ||
uses: snakemake/[email protected] | ||
with: | ||
directory: .test | ||
snakefile: workflow/Snakefile | ||
args: "--use-conda --show-failed-logs --cores 3 --conda-cleanup-pkgs cache --all-temp" | ||
|
||
- name: Test report | ||
uses: snakemake/[email protected] | ||
with: | ||
directory: .test | ||
snakefile: workflow/Snakefile | ||
args: "--report report.zip" |
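To reproduce the `Testing` job locally, the three inputs of each step (`directory`, `snakefile`, `args`) can be assembled into a Snakemake command line. This sketch assumes the action forwards `directory` and `snakefile` to Snakemake's `--directory` and `--snakefile` options; the helper name `snakemake_command` is hypothetical.

```python
import shlex
from typing import List


def snakemake_command(directory: str, snakefile: str, args: str) -> List[str]:
    """Build an argument list equivalent to one CI step (assumed mapping)."""
    return [
        "snakemake",
        "--directory", directory,
        "--snakefile", snakefile,
        *shlex.split(args),  # split the free-form args string like a shell would
    ]


# Values copied from the "Test workflow" step of the Tests workflow.
test_cmd = snakemake_command(
    ".test",
    "workflow/Snakefile",
    "--use-conda --show-failed-logs --cores 3 --conda-cleanup-pkgs cache --all-temp",
)

if __name__ == "__main__":
    # Print the command instead of running it; pass test_cmd to subprocess.run to execute.
    print(shlex.join(test_cmd))
```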
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
on: | ||
push: | ||
branches: | ||
- main | ||
|
||
name: release-please | ||
|
||
jobs: | ||
release-please: | ||
runs-on: ubuntu-latest | ||
steps: | ||
|
||
- uses: GoogleCloudPlatform/release-please-action@v2 | ||
id: release | ||
with: | ||
release-type: go # just keep a changelog, no version anywhere outside of git tags | ||
package-name: <repo> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
results/** | ||
resources/** | ||
.snakemake | ||
.snakemake/** | ||
.venv/** | ||
.env/** | ||
.DS_Store | ||
!.gitignore | ||
!.gitattributes | ||
!.editorconfig | ||
|
||
# Custom additions | ||
Notes.md | ||
.vscode/* | ||
.snakemake-workflow-catalog.yml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
|
||
# optional: define output folder here | ||
# default: "./results" | ||
output: null | ||
|
||
# define samplesheet here | ||
samplesheet: "./test/config/samples.tsv" | ||
|
||
get_genome: | ||
database: "ncbi" | ||
assembly: "GCF_000006785.2" | ||
fasta: Null | ||
gff: Null | ||
|
||
cutadapt: | ||
fivep_adapter: Null | ||
threep_adapter: "ATCGTAGATCGGAAGAGCACACGTCTGAA" | ||
default: ["-q 10 ", "-m 22 ", "-M 52", "--overlap=3"] | ||
|
||
umi_extraction: | ||
method: "regex" | ||
pattern: "^(?P<umi_0>.{2}).*(?P<umi_1>.{5})$" | ||
umi_dedup: ["--edit-distance-threshold=0", "--read-length"] | ||
|
||
star: | ||
index: Null | ||
genomeSAindexNbases: 9 | ||
multi: 10 | ||
sam_multi: 1 | ||
intron_max: 1 | ||
default: [ | ||
"--readFilesCommand zcat ", | ||
"--outSAMstrandField None ", | ||
"--outSAMattributes All ", | ||
"--outSAMattrIHstart 0 ", | ||
"--outFilterType Normal ", | ||
"--outFilterMultimapScoreRange 1 ", | ||
"-o STARmappings ", | ||
"--outSAMtype BAM Unsorted ", | ||
"--outStd BAM_Unsorted ", | ||
"--outMultimapperOrder Random ", | ||
"--alignEndsType EndToEnd"] | ||
|
||
extract_features: | ||
biotypes: ["rRNA", "tRNA"] | ||
CDS: ["protein_coding"] | ||
|
||
bedtools_intersect: | ||
defaults: ["-v ", "-s ", "-f 0.2"] | ||
|
||
deeptools: | ||
libtype: "sense" # parameter not used yet. Only sense direction currently handled # TODO # | ||
bin_size: 1 | ||
normalize: "CPM" | ||
|
||
annotate_orfs: | ||
window_size: 30 | ||
sorf_max_length: 300 | ||
sorf_min_length: 45 | ||
orf_start_codon_table: 11 | ||
orf_stop_codon: ["TAA", "TAG", "TGA"] | ||
orf_longest_only: False | ||
|
||
shift_reads: | ||
window_size: 30 | ||
read_length: [27, 45] | ||
# rpf_read_length: [30, 45] | ||
# qti_read_length: [27, 45] | ||
rnaseq_read_length: [0, 1000] | ||
end_alignment: "3prime" | ||
shift_table: "config/shift_table/shift_table.csv" | ||
export_bam: False | ||
export_bigwig: True | ||
skip_shifting: False | ||
skip_length_filter: True | ||
|
||
|
||
multiqc: | ||
config: "config/multiqc_config.yml" |
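The `umi_extraction.pattern` value in the config above describes a UMI split across both read ends: 2 bases at the 5' end (`umi_0`) and 5 bases at the 3' end (`umi_1`). The sketch below applies the same pattern with Python's `re` module to show what it captures. It assumes the two parts are concatenated in group order and removed from the read, which is how regex-based UMI extraction is commonly described; the workflow's actual extraction tool may behave differently in detail.

```python
import re
from typing import Optional, Tuple

# Pattern copied verbatim from config.yml (umi_extraction.pattern).
UMI_PATTERN = re.compile(r"^(?P<umi_0>.{2}).*(?P<umi_1>.{5})$")


def split_umi(read: str) -> Optional[Tuple[str, str]]:
    """Return (umi, remaining_read), or None if the read is too short to match."""
    match = UMI_PATTERN.match(read)
    if match is None:
        return None
    umi = match.group("umi_0") + match.group("umi_1")
    # Everything between the two UMI parts is the remaining read sequence.
    insert = read[match.end("umi_0"):match.start("umi_1")]
    return umi, insert


if __name__ == "__main__":
    print(split_umi("ACGGGTTTTCCCCC"))
```

Reads shorter than 7 bases cannot satisfy both groups and are reported as `None` here.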
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
sample condition replicate lib_prep data_folder fq1 | ||
RPF-RTP1 RPF-RTP 1 mpusp .test/data RPF-RTP1_R1_001.fastq.gz | ||
RPF-RTP2 RPF-RTP 2 mpusp .test/data RPF-RTP2_R1_001.fastq.gz | ||
|
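The sample sheet above has one row per sample with the columns `sample`, `condition`, `replicate`, `lib_prep`, `data_folder` and `fq1`. A minimal sketch of reading it with the standard library follows. It assumes the file is tab-separated (the extension is `.tsv`) and that the FASTQ location is `data_folder` joined with `fq1`; how the workflow itself resolves paths is not shown in this diff.

```python
import csv
import io
from pathlib import PurePosixPath
from typing import Dict, TextIO

# Rows copied from the sample sheet in this commit, with tabs assumed as separators.
SAMPLESHEET = (
    "sample\tcondition\treplicate\tlib_prep\tdata_folder\tfq1\n"
    "RPF-RTP1\tRPF-RTP\t1\tmpusp\t.test/data\tRPF-RTP1_R1_001.fastq.gz\n"
    "RPF-RTP2\tRPF-RTP\t2\tmpusp\t.test/data\tRPF-RTP2_R1_001.fastq.gz\n"
)


def read_samples(handle: TextIO) -> Dict[str, Dict[str, str]]:
    """Index sample sheet rows by their 'sample' column."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["sample"]: row for row in reader}


def fastq_path(row: Dict[str, str]) -> str:
    """Join data_folder and fq1 (assumed convention)."""
    return str(PurePosixPath(row["data_folder"]) / row["fq1"])


samples = read_samples(io.StringIO(SAMPLESHEET))

if __name__ == "__main__":
    for name, row in samples.items():
        print(name, row["condition"], fastq_path(row))
```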
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
Copyright License for non-commercial scientific research purposes | ||
|
||
Please read carefully the following terms and conditions and any accompanying documentation before you download and/or use files from this repository (Files). By downloading and/or using the Files, you acknowledge that you have read these terms and conditions, understand them, and agree to be bound by them. If you do not agree with these terms and conditions, you must not download and/or use the Files. Any infringement of the terms of this agreement will automatically terminate your rights under this License | ||
|
||
Ownership / Licensees | ||
The Files and the associated materials have been developed at the Max Planck Unit for the Science of Pathogens (hereinafter "MPUSP") by members of the Bioinformatics Platform (hereinafter “developers”). | ||
Any copyright or patent right is owned by and proprietary material of the Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (hereinafter “MPG”; MPUSP and MPG hereinafter collectively “Max-Planck”) hereinafter the “Licensor”. | ||
|
||
License Grant | ||
Licensor grants you (Licensee) personally a single-user, non-exclusive, non-transferable, free of charge right: | ||
• To download the Files on computers owned, leased or otherwise controlled by you and/or your organization; | ||
• To use the Files for the sole purpose of performing non-commercial scientific research, non-commercial education, or non-commercial artistic projects; | ||
• To modify, adapt, translate or create derivative works based upon the Files. | ||
Any other use, in particular any use for commercial purposes, is prohibited. This includes, without limitation, incorporation in a commercial product, use in a commercial service, or production of other artefacts for commercial purposes. The Files may not be reproduced, modified and/or made available in any form to any third party without Max-Planck’s prior written permission. Third party funded research at academic institution is considered non-Commercial Use. By downloading the Files, you agree not to reverse engineer it. | ||
|
||
The Licensee is encouraged to provide feedback on the use, results, modifications, bugs and technical questions or publications to [email protected]. | ||
|
||
No Distribution | ||
The Files and the license herein granted shall not be copied, shared, distributed, re-sold, offered for re-sale, transferred or sub-licensed in whole or in part except that you may make one copy for archive purposes only and the use of the Files by Other researchers at your institution under the conditions of this license. | ||
|
||
Disclaimer of Representations and Warranties | ||
You expressly acknowledge and agree that the Files results from basic research, is provided “AS IS”, may contain errors, and that any use of the Files is at your sole risk. LICENSOR MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND CONCERNING THE FILES, NEITHER EXPRESS NOR IMPLIED, AND THE ABSENCE OF ANY LEGAL OR ACTUAL DEFECTS, WHETHER DISCOVERABLE OR NOT. Specifically, and not to limit the foregoing, licensor makes no representations or warranties (i) regarding the merchantability or fitness for a particular purpose of the Files, (ii) that the use of the Files will not infringe any patents, copyrights or other intellectual property rights of a third party, and (iii) that the use of the Files will not cause any damage of any kind to you or a third party. | ||
|
||
Limitation of Liability | ||
Because this File License Agreement qualifies as a donation, according to Section 521 of the German Civil Code (Bürgerliches Gesetzbuch – BGB) Licensor as a donor is liable for intent and gross negligence only. If the Licensor fraudulently conceals a legal or material defect, they are obliged to compensate the Licensee for the resulting damage. | ||
Licensor shall be liable for loss of data only up to the amount of typical recovery costs which would have arisen had proper and regular data backup measures been taken. The foregoing applies also to Licensor’s legal representatives or assistants in performance. Any further liability shall be excluded. Patent claims generated through the usage of the Files cannot be directed towards the copyright holders. | ||
The Files are provided in state of development the licensor defines. If modified or extended by Licensee, the Licensor makes no claims about the fitness of the Files and is not responsible for any problems such modifications cause. | ||
|
||
No Maintenance Services | ||
You understand and agree that Licensor is under no obligation to provide either maintenance services, update services, notices of latent defects, or corrections of defects with regard to the Files. Licensor nevertheless reserves the right to update, modify, or discontinue the Software at any time. | ||
Defects of the Files must be notified in writing to the Licensor with a comprehensible description of the error symptoms. The notification of the defect should enable the reproduction of the error. The Licensee is encouraged to communicate any use, results, modification or publication. | ||
|
||
Publications using the Files | ||
You acknowledge that the Files are a valuable scientific resource and agree to appropriately acknowledge MPUSP in any publication making use of the Files. | ||
|
||
Commercial licensing opportunities | ||
For commercial uses of the Files, please send an email to [email protected]. | ||
|
||
Miscellaneous | ||
This Agreement shall be governed by the laws of the Federal Republic of Germany except for the UN Sales Convention. | ||
This License Text is itself licensed under Creative Commons BY-NC-SA 4.0 https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
# snakemake-bacterial-riboseq | ||
|
||
![Platform](https://img.shields.io/badge/platform-all-green) | ||
[![Snakemake](https://img.shields.io/badge/snakemake-≥7.0.0-brightgreen.svg)](https://snakemake.github.io) | ||
[![GitHub actions status](https://github.com/MPUSP/snakemake-bacterial-riboseq/workflows/Tests/badge.svg?branch=main)](https://github.com/MPUSP/snakemake-bacterial-riboseq/actions?query=branch%3Amain+workflow%3ATests) | ||
[![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/) | ||
|
||
--- | ||
|
||
A Snakemake workflow for the analysis of bacterial riboseq data. | ||
|
||
- [snakemake-bacterial-riboseq](#snakemake-bacterial-riboseq) | ||
- [Usage](#usage) | ||
- [Workflow overview](#workflow-overview) | ||
- [Installation](#installation) | ||
- [Authors](#authors) | ||
- [References](#references) | ||
|
||
## Usage | ||
|
||
The usage of this workflow is described in the [Snakemake Workflow Catalog](https://snakemake.github.io/snakemake-workflow-catalog/?usage=MPUSP%2Fsnakemake-bacterial-riboseq). | ||
|
||
If you use this workflow in a paper, don't forget to give credit to the authors by citing the URL of this (original) repository and its DOI (see above). | ||
|
||
## Workflow overview | ||
|
||
TODO: include first part of the figure here. | ||
|
||
## Installation | ||
|
||
**Step 1: Clone this repository** | ||
|
||
```bash | ||
git clone https://github.com/MPUSP/snakemake-bacterial-riboseq.git | ||
cd snakemake-bacterial-riboseq | ||
``` | ||
|
||
**Step 2: Install dependencies** | ||
|
||
It is recommended to install snakemake and run the workflow with `conda`, `mamba` or `micromamba`. | ||
|
||
```bash | ||
# download Miniconda3 installer | ||
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh | ||
# install Conda (respond with 'yes') | ||
bash miniconda.sh | ||
# update Conda | ||
conda update -y conda | ||
# install Mamba | ||
conda install -n base -c conda-forge -y mamba | ||
``` | ||
|
||
**Step 3: Create snakemake environment** | ||
|
||
This step creates a new conda environment called `snakemake-bacterial-riboseq`. | ||
|
||
```bash | ||
# create new environment with dependencies & activate it | ||
mamba create -c conda-forge -c bioconda -n snakemake-bacterial-riboseq snakemake pandas | ||
conda activate snakemake-bacterial-riboseq | ||
``` | ||
|
||
### Additional tools | ||
|
||
**Important note:** | ||
|
||
All other dependencies for the workflow are **automatically pulled as `conda` environments** by Snakemake when the workflow is run with the `--use-conda` parameter (recommended). | ||
|
||
|
||
## Authors | ||
|
||
- Dr. Rina Ahmed-Begrich | ||
- Affiliation: [Max-Planck-Unit for the Science of Pathogens](https://www.mpusp.mpg.de/) (MPUSP), Berlin, Germany | ||
- ORCID profile: https://orcid.org/0000-0002-0656-1795 | ||
- Dr. Michael Jahn | ||
- Affiliation: [Max-Planck-Unit for the Science of Pathogens](https://www.mpusp.mpg.de/) (MPUSP), Berlin, Germany | ||
- ORCID profile: https://orcid.org/0000-0002-3913-153X | ||
- github page: https://github.com/m-jahn | ||
|
||
|
||
Visit the MPUSP github page at https://github.com/MPUSP for more info on this workflow and other projects. | ||
|
||
## References | ||
|
||
- Essential tools are linked in the top section of this document | ||
- The sequencing library preparation is based on the publication: | ||
> McGlincy, N. J., & Ingolia, N. T. _Transcriptome-wide measurement of translation by ribosome profiling_. Methods, 126, 112–129, **2017**. https://doi.org/10.1016/J.YMETH.2017.05.028. |
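The README badge states a requirement of Snakemake ≥7.0.0. After creating the environment from the installation steps, the installed version can be compared against that minimum. This is a minimal sketch using only the standard library; the helper names are hypothetical and the version parsing is deliberately simple (numeric `major.minor.patch` only).

```python
from importlib import metadata
from typing import Optional, Tuple

MIN_SNAKEMAKE = (7, 0, 0)  # taken from the README badge


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading digits of up to three dot-separated fields, padded to length 3."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def installed_snakemake() -> Optional[str]:
    """Return the installed Snakemake version string, or None if it is not installed."""
    try:
        return metadata.version("snakemake")
    except metadata.PackageNotFoundError:
        return None


def meets_minimum(version: str) -> bool:
    return parse_version(version) >= MIN_SNAKEMAKE


if __name__ == "__main__":
    version = installed_snakemake()
    if version is None:
        print("snakemake is not installed in this environment")
    else:
        print(version, "ok" if meets_minimum(version) else "too old")
```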