Integrative Analysis Recipe
Stephan Reichl edited this page Aug 7, 2024 · 2 revisions
The Integrative ATAC-seq & RNA-seq Analysis Recipe takes you from unaligned (raw) BAM files of matched bulk RNA-seq and bulk ATAC-seq experiments to enrichment analysis results for genes that deviate between the two modalities (termed epigenetic potential vs. relative transcriptional abundance). Along the way, it provides unsupervised analyses of the integrated dataset and genome browser tracks for quality control.
flowchart LR;
ngs_fetch_RNA-->rnaseq_pipeline;
ngs_fetch_ATAC-->atacseq_pipeline;
rnaseq_pipeline-->genome_tracks_RNA;
atacseq_pipeline-->genome_tracks_ATAC;
rnaseq_pipeline-->spilterlize_integrate;
atacseq_pipeline-->spilterlize_integrate;
spilterlize_integrate-->unsupervised_analysis;
spilterlize_integrate-->dea_limma;
dea_limma-->enrichment_analysis;
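To make the dependencies in the flowchart explicit, the sketch below encodes its edges as a plain Python dictionary and derives one valid run order with the standard library's `graphlib`. The module names are copied from the flowchart. The helper names `upstream_map` and `execution_order` are hypothetical and are not part of any Module. The Recipe itself may schedule its steps differently, so treat this only as a way to read the graph.

```python
from graphlib import TopologicalSorter

# Edges copied from the flowchart: upstream module -> downstream modules.
EDGES = {
    "ngs_fetch_RNA": ["rnaseq_pipeline"],
    "ngs_fetch_ATAC": ["atacseq_pipeline"],
    "rnaseq_pipeline": ["genome_tracks_RNA", "spilterlize_integrate"],
    "atacseq_pipeline": ["genome_tracks_ATAC", "spilterlize_integrate"],
    "spilterlize_integrate": ["unsupervised_analysis", "dea_limma"],
    "dea_limma": ["enrichment_analysis"],
}


def upstream_map(edges):
    """Invert upstream -> downstream edges into node -> set of prerequisites."""
    prereqs = {}
    for src, targets in edges.items():
        prereqs.setdefault(src, set())
        for dst in targets:
            prereqs.setdefault(dst, set()).add(src)
    return prereqs


def execution_order(edges):
    """Return one order in which every module runs after its prerequisites."""
    return list(TopologicalSorter(upstream_map(edges)).static_order())


if __name__ == "__main__":
    for step, module in enumerate(execution_order(EDGES), start=1):
        print(step, module)
```

Several orders are valid because the RNA and ATAC branches are independent until `spilterlize_integrate`, which needs the outputs of both pipelines.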
The following Modules are used in this Recipe:
- (optional, coming soon) Fetch publicly available bulk RNA-seq and bulk ATAC-seq data.
- (coming soon) RNA-seq pipeline to quantify gene expression, resulting in count matrices and annotations.
- ATAC-seq pipeline to quantify chromatin accessibility at the promoter/TSS locus of each gene, resulting in count matrices and annotations.
- Genome Browser Track Visualization for quality control and visual analysis of genomic regions of interest or top hits from downstream analyses.
- Split, Filter, Normalize and Integrate Sequencing Data to integrate the two modalities into one common feature space and prepare for downstream analysis.
- Unsupervised Analysis for quality control and to understand and visualize similarities and variations between samples of the integrated dataset in the shared feature space.
- Differential Analysis with limma to identify and visualize statistically significant gene-promoter pairs (deviating genes) that differ between modalities despite computational integration.
- Enrichment Analysis for biomedical interpretation of differential analysis results using prior knowledge.
(only notes)
- Input
  - matched bulk RNA-seq and bulk ATAC-seq samples
- Output
  - integrated dataset using spilterlize_integrate
  - deviating genes (epigenetic potential vs relative transcriptional abundance) using dea_limma
  - differential programs and TFs using enrichment_analysis
- Features
  - integration into the same feature space (e.g., by mapping ATAC-seq reads to promoters, normalization, assay effect removal, …)
  - determination of deviating genes, i.e., epigenetic potential and relative transcriptional abundance, by differential analysis between assays
- Provide integrated analysis of RNA-seq and ATAC-seq as a recipe
  - first, the RNA-seq and ATAC-seq pipelines are run
  - the results are passed to split, filter, normalize and integrate → especially integrating across the different modalities
  - unsupervised analysis of the dataset to show/check integration
  - then the differential analysis module (limma) is used to find deviating genes, i.e., epigenetic potential and relative transcriptional abundance
  - genome track plots of RNA & ATAC for top hits, visually explaining epigenetic potential and relative transcriptional abundance
  - finally, enrichment analysis (including TFBS) for biological interpretation and gene regulation
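The idea behind the notes above (bring both assays into one feature space, then compare them per gene) can be illustrated with a deliberately simplified sketch. It is not how spilterlize_integrate or dea_limma work internally: log2 CPM plus a per-assay z-score stands in for normalization and assay effect removal, and a plain difference of means stands in for the limma-based differential analysis. All function names and the toy counts are made up for illustration. The sign convention (positive meaning accessibility exceeds expression) is an assumption of this sketch.

```python
import math
from statistics import mean, pstdev

# Made-up toy counts, {sample: {gene: count}}; ATAC counts are per-gene promoter counts.
TOY_RNA = {
    "s1": {"geneA": 900, "geneB": 10, "geneC": 90},
    "s2": {"geneA": 850, "geneB": 20, "geneC": 130},
}
TOY_ATAC = {
    "s1": {"geneA": 10, "geneB": 900, "geneC": 90},
    "s2": {"geneA": 20, "geneB": 850, "geneC": 130},
}


def log_cpm(counts_by_sample, pseudocount=1.0):
    """Convert raw counts to log2 counts-per-million, sample by sample."""
    out = {}
    for sample, counts in counts_by_sample.items():
        library_size = sum(counts.values())
        out[sample] = {
            gene: math.log2(count / library_size * 1e6 + pseudocount)
            for gene, count in counts.items()
        }
    return out


def standardize(values_by_sample):
    """Z-score all values of one assay (crude stand-in for assay effect removal)."""
    flat = [v for genes in values_by_sample.values() for v in genes.values()]
    mu = mean(flat)
    sigma = pstdev(flat) or 1.0  # avoid division by zero for constant input
    return {
        sample: {gene: (v - mu) / sigma for gene, v in genes.items()}
        for sample, genes in values_by_sample.items()
    }


def deviation_scores(rna_counts, atac_counts):
    """Per gene: mean standardized ATAC signal minus mean standardized RNA signal."""
    rna = standardize(log_cpm(rna_counts))
    atac = standardize(log_cpm(atac_counts))
    gene_sets = [set(genes) for genes in list(rna.values()) + list(atac.values())]
    shared = set.intersection(*gene_sets)  # the common feature space
    return {
        gene: mean(s[gene] for s in atac.values()) - mean(s[gene] for s in rna.values())
        for gene in sorted(shared)
    }


if __name__ == "__main__":
    for gene, score in deviation_scores(TOY_RNA, TOY_ATAC).items():
        print(gene, round(score, 3))
```

In this toy setup, a gene whose promoter is comparatively accessible but weakly expressed gets a positive score, and the reverse case gets a negative one. The actual Recipe tests such differences statistically rather than reading them off a raw difference.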
Templates for a Methods section of a scientific publication can be found in each Module's README.
--- COMING SOON ---