README.md
18 additions & 18 deletions
@@ -25,9 +25,9 @@ The following conventions are used in this document to differentiate between a w
Pipeline overview
=================
-
The analysis pipeline takes as input a .sam file containing BWA alignments of trimmed reads for a sample and generates a) annotations of the reads, b) expression profiles of features annotated in the sample, c) summarized miRNA expression as TCGA (The Cancer Genome Atlas) formatted expression quantification reports and as expression matrices, and d) graphs of overall feature representation, filtered results, and quality metrics. The miRNA analysis pipeline version v0.2.7 has been tested with BWA 0.5.7.
+
The analysis pipeline takes as input a .sam file containing BWA alignments of trimmed reads for a sample and generates a) annotations of the reads, b) expression profiles of features annotated in the sample, c) summarized miRNA expression as TCGA (The Cancer Genome Atlas) formatted expression quantification reports and as expression matrices, and d) graphs of overall feature representation, filtered results, and quality metrics. The miRNA analysis pipeline version v0.2.8 has been tested with BWA 0.5.7.
-
External applications required to run the pipeline can be placed in the apps folder in the top pipeline directory. e.g. <BASEDIR>/v0.2.7/apps/. Although applications do not have to be stored in this directory, Perl (perl-5.10-x86\_64) and R (R-2.12.0) must be available on your system. In addition, Perl requires the MySQL DBI library. R is used to generate summary graphs, and may be disregarded if graphs are not desired.
+
External applications required to run the pipeline can be placed in the apps folder in the top pipeline directory. e.g. <BASEDIR>/v0.2.8/apps/. Although applications do not have to be stored in this directory, Perl (perl-5.10-x86\_64) and R (R-2.12.0) must be available on your system. In addition, Perl requires the MySQL DBI library. R is used to generate summary graphs, and may be disregarded if graphs are not desired.
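A quick way to confirm these prerequisites before a run is sketched below. It is only an illustration: the helper names `check_cmd` and `check_perl_module` are not part of the pipeline, and reading "the MySQL DBI library" as the `DBI` and `DBD::mysql` Perl modules is an assumption.

```shell
#!/bin/sh
# Sketch: report whether the prerequisites named above are reachable.
# The function names are illustrative and not part of the pipeline.

# Print "found: NAME" or "missing: NAME"; return non-zero when missing.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Try to load a Perl module; reports "missing" if perl itself is absent.
check_perl_module() {
    if perl -M"$1" -e 1 >/dev/null 2>&1; then
        echo "found: Perl module $1"
    else
        echo "missing: Perl module $1"
        return 1
    fi
}

status=0
check_cmd perl || status=1
# Assumption: "MySQL DBI library" means DBI plus the DBD::mysql driver.
check_perl_module DBI || status=1
check_perl_module DBD::mysql || status=1
# Rscript is only needed when the summary graphs are wanted.
check_cmd Rscript || echo "note: graphs will be unavailable without R"
echo "prerequisite check status: $status"
```

A status of 0 means Perl and both modules were found; a missing Rscript is reported but does not change the status, since graphs are optional.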
Pipeline parameters
===================
@@ -150,7 +150,7 @@ There are two configuration files which must be modified:
- **db\_connections.cfg** contains database settings to access MySQL databases containing the necessary UCSC and miRBase information. You must have a database connection to a miRBase instance and a UCSC database instance for annotations of miRNAs and other non-coding RNAs respectively. The <db\_name> field provides the database source for various script parameters, <host> is the server name of the database, and <user> and <password> are the user-specific login and password.
- **pipeline\_params.cfg** is provided for additional, optional settings such as the path to an Rscript binary for the optional graphing functions, e.g. Rscript=<BASEDIR>/apps/R-2.12.0/lib64/R/bin/Rscript
-
There is one additional script that must be modified with the path to perl 5. Modify **profile.sh** so that it points to the appropriate Perl to use for the scripts. Run source on this to generate the environment, e.g. {cluster-host}~> source <BASEDIR>/v0.2.7/config/profile.sh
+
There is one additional script that must be modified with the path to perl 5. Modify **profile.sh** so that it points to the appropriate Perl to use for the scripts. Run source on this to generate the environment, e.g. {cluster-host}~> source <BASEDIR>/v0.2.8/config/profile.sh
Annotate the .sam files
========================
@@ -187,7 +187,7 @@ Appropriate miRbase and UCSC database access information (database name, host, u
The annotation script requires the names of the desired miRbase and UCSC databases listed in db\_connections.cfg, the miRbase species code to use (e.g. hsa for human), and the path to the top project directory (i.e. the directory where the LIBID subdirectories are located).
#### Run library\_stats/tcga/expression\_matrix\_mimat.pl
```bash
-
{dbhost or xhost}~> perl <BASEDIR>/v0.2.7/code/library_stats/tcga/expression_matrix_mimat.pl -m <miRNA_adf_file> -p <Level_3_archive_directory>
+
{dbhost or xhost}~> perl <BASEDIR>/v0.2.8/code/library_stats/tcga/expression_matrix_mimat.pl -m <miRNA_adf_file> -p <Level_3_archive_directory>
```
-m miRNA\_ADF is the adf file generated by create\_adf.pl
-p the directory where the \*.isoform.quantification.txt files can be found. The script will loop through all isoform.quantification.txt files that it finds in this directory to build the expression matrix.
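Before building the matrix it can help to confirm that the -p directory actually holds input files. The sketch below counts them. `count_isoform_files` is a hypothetical helper, not a pipeline script, and matching only the top level of the directory (no recursion) is an assumption about how the script searches.

```shell
#!/bin/sh
# Hypothetical helper, not part of the pipeline: count the
# *.isoform.quantification.txt files directly inside a directory.
count_isoform_files() {
    count=0
    for f in "$1"/*.isoform.quantification.txt; do
        # An unmatched glob stays literal, so test that the file exists.
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Example: check the directory given as the first argument,
# or the current directory when none is given.
dir="${1:-.}"
echo "isoform quantification files in $dir: $(count_isoform_files "$dir")"
```

A count of 0 suggests the path given to -p is wrong or the Level 3 archive has not been generated yet.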
@@ -351,7 +351,7 @@ The alignment\_stats.csv file and the report files located in each \_features su
Make sure that if running this script through an ssh tunnel, X11 forwarding is enabled so that the R graphing libraries can be used. e.g. ~> ssh -X <MYSERVER>
@@ -366,16 +366,16 @@ The miRNA annotation pipeline requires the following scripts and software
#### **Description of online documentation files**
-
- *Pre-alignment processing:* A general description of how adapter trimming is performed is provided in the "1a. Preprocessing" section of the pipeline description at: <BASEDIR>/v0.2.7/DESCRIPTION.txt. Additional description and usage information is available through the publicly available archive http://www.bcgsc.ca/platform/bioinfo/software/adapter-trimming-for-small-rna-sequencing
-
- *Annotation:* Overview of miRNA analysis at the BCCA-GSC: <BASEDIR>/v0.2.7/DESCRIPTION.txt. Script usage detailed: <BASEDIR>/v0.2.7/code/annotation/HOWTO.txt
-
- *Expression:* TCGA formatted miRNA expression reports: <BASEDIR>/v0.2.7/custom\_output/tcga/README.txt File formats for TCGA data files are also available online at: <https://wiki.nci.nih.gov/display/TCGA/miRNASeq
+
- *Pre-alignment processing:* A general description of how adapter trimming is performed is provided in the "1a. Preprocessing" section of the pipeline description at: <BASEDIR>/v0.2.8/DESCRIPTION.txt. Additional description and usage information is available through the publicly available archive http://www.bcgsc.ca/platform/bioinfo/software/adapter-trimming-for-small-rna-sequencing
+
- *Annotation:* Overview of miRNA analysis at the BCCA-GSC: <BASEDIR>/v0.2.8/DESCRIPTION.txt. Script usage detailed: <BASEDIR>/v0.2.8/code/annotation/HOWTO.txt
+
- *Expression:* TCGA formatted miRNA expression reports: <BASEDIR>/v0.2.8/custom\_output/tcga/README.txt File formats for TCGA data files are also available online at: <https://wiki.nci.nih.gov/display/TCGA/miRNASeq
Annotation priorities used to resolve multiple matches, from highest to lowest priority. If a read has more than 1 alignment, and the annotations are different, the priorities from Table 1 are used as long as only 1 alignment is to a miRNA. If there are multiple alignments to different miRNAs (or even different regions of the same miRNA), the read is flagged as cross-mapped and every miRNA annotation is preserved. Using these rules, the reads are summed by annotation, and coverage reports are generated.
v0.2.8/README.TXT
1 addition & 1 deletion
@@ -1,4 +1,4 @@
-
miRNA Analysis Pipeline v0.2.7
+
miRNA Analysis Pipeline v0.2.8
The TCGA miRNAseq data generation process, including strand-specific library construction, sequencing, and computational processing is described in:
Chu A, Robertson G, Brooks D, Mungall AJ, Birol I, Coope R, Ma Y, Jones S, Marra MA. Large-scale profiling of microRNAs for The Cancer Genome Atlas. Nucleic Acids Res. 2015 Aug 13.