Commit 2bb59e5

Created bug fix release v0.2.8
1 parent 9cde353 commit 2bb59e5

24 files changed: +19 -19 lines changed
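
The textual changes shown in this commit all replace the version string v0.2.7 with v0.2.8. As a rough illustration, a bump like this could be scripted along the following lines. This is a minimal sketch, not how the commit was actually produced: `bump_version` is a hypothetical helper that is not part of the pipeline, and it assumes the version strings contain no `/` or `&` characters.

```shell
# Hypothetical helper (not part of the pipeline): replace one version string
# with another in every file given as an argument.
# Usage: bump_version OLD NEW FILE...
bump_version() (
    old="$1"
    new="$2"
    shift 2
    # Escape dots so that "v0.2.7" is matched literally, not as a regex.
    pattern="$(printf '%s' "$old" | sed 's/\./\\./g')"
    for file in "$@"; do
        tmp="$(mktemp)" || exit 1
        # Write to a temporary file first; only overwrite the original
        # if sed succeeded.
        sed "s/${pattern}/${new}/g" "$file" > "$tmp" && cat "$tmp" > "$file"
        rm -f "$tmp"
    done
)
```

For example, `bump_version v0.2.7 v0.2.8 README.md` would rewrite the paths and version mentions in the README in place. Reviewing the result with `git diff` before committing is advisable, since a plain string replacement cannot tell a version reference from an unrelated match.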

README.md

Lines changed: 18 additions & 18 deletions
@@ -25,9 +25,9 @@ The following conventions are used in this document to differentiate between a w
2525
Pipeline overview
2626
=================
2727

28-
The analysis pipeline takes as input a .sam file containing BWA alignments of trimmed reads for a sample and generates a) annotations of the reads, b) expression profiles of features annotated in the sample, c) summarized miRNA expression as TCGA (The Cancer Genome Atlas) formatted expression quantification reports and as expression matrices, and d) graphs of overall feature representation, filtered results, and quality metrics. The miRNA analysis pipeline version v0.2.7 has been tested with BWA 0.5.7.
28+
The analysis pipeline takes as input a .sam file containing BWA alignments of trimmed reads for a sample and generates a) annotations of the reads, b) expression profiles of features annotated in the sample, c) summarized miRNA expression as TCGA (The Cancer Genome Atlas) formatted expression quantification reports and as expression matrices, and d) graphs of overall feature representation, filtered results, and quality metrics. The miRNA analysis pipeline version v0.2.8 has been tested with BWA 0.5.7.
2929

30-
External applications required to run the pipeline can be placed in the apps folder in the top pipeline directory. e.g. <BASEDIR>/v0.2.7/apps/. Although applications do not have to be stored in this directory, Perl (perl-5.10-x86\_64) and R (R-2.12.0) must be available on your system. In addition, Perl requires the MySQL DBI library. R is used to generate summary graphs, and may be disregarded if graphs are not desired.
30+
External applications required to run the pipeline can be placed in the apps folder in the top pipeline directory. e.g. <BASEDIR>/v0.2.8/apps/. Although applications do not have to be stored in this directory, Perl (perl-5.10-x86\_64) and R (R-2.12.0) must be available on your system. In addition, Perl requires the MySQL DBI library. R is used to generate summary graphs, and may be disregarded if graphs are not desired.
3131

3232
Pipeline parameters
3333
===================
@@ -150,7 +150,7 @@ There are two configuration files which must be modified:
150150
- **db\_connections.cfg** contains database settings to access MySQL databases containing the necessary UCSC and miRBase information. You must have a database connection to a miRBase instance and a UCSC database instance for annotations of miRNAs and other non-coding RNAs respectively. &lt;db\_name&gt; field provides the database source for various script parameters &lt;host&gt; is the server name of the database. &lt;user&gt; and &lt;password&gt; are the user-specific login and password
151151
- **pipeline\_params.cfg** is provided for additional, optional settings such as the path to an Rscript binary for the optional graphing functions, e.g. Rscript=&lt;BASEDIR&gt;/apps/R-2.12.0/lib64/R/bin/Rscript
152152

153-
There is one additional script that must be modified with the path to perl 5. Modify **profile.sh** so that it points to the appropriate Perl to use for the scripts. Run source on this to generate the environment, e.g. {cluster-host}~> source <BASEDIR>/v0.2.7/config/profile.sh
153+
There is one additional script that must be modified with the path to perl 5. Modify **profile.sh** so that it points to the appropriate Perl to use for the scripts. Run source on this to generate the environment, e.g. {cluster-host}~> source <BASEDIR>/v0.2.8/config/profile.sh
154154

155155
Annotate the .sam files
156156
========================
@@ -187,7 +187,7 @@ Appropriate miRbase and UCSC database access information (database name, host, u
187187
The annotation script requires the names of the desired miRbase and UCSC databases listed in db\_connections.cfg, the miRbase species code to use (e.g. hsa for human), and the path to the top project directory (i.e. the directory where the LIBID subdirectories are located).
188188

189189
```bash
190-
{cluster-host}~> perl <BASEDIR>/v0.2.7/code/annotation/annotate.pl -m <mirbase> -u <ucsc_database> -o <species_code> -p <PROJECT>
190+
{cluster-host}~> perl <BASEDIR>/v0.2.8/code/annotation/annotate.pl -m <mirbase> -u <ucsc_database> -o <species_code> -p <PROJECT>
191191
```
192192
#### **Output:**
193193

@@ -211,7 +211,7 @@ Annotated &lt;LIBID&gt;\_&lt;INDEX&gt;.sam files in the &lt;LIBID&gt; subdirecto
211211
#### **Run alignment\_stats.pl**
212212

213213
```perl
214-
{dbhost or xhost}~> perl <BASEDIR>/v0.2.7/code/library_stats/alignment_stats.pl -p <PROJECT>
214+
{dbhost or xhost}~> perl <BASEDIR>/v0.2.8/code/library_stats/alignment_stats.pl -p <PROJECT>
215215
```
216216
#### **Output**
217217

@@ -241,7 +241,7 @@ expression files from &lt;PROJECT&gt;/&lt;LIBID&gt;/&lt;LIBID&gt;\_&lt;INDEX&gt;
241241
#### **Run tcga.pl**
242242

243243
```bash
244-
{dbhost or xhost}~> perl <BASEDIR>/v0.2.7/code/custom\_output/tcga.pl -p <PROJECT>
244+
{dbhost or xhost}~> perl <BASEDIR>/v0.2.8/code/custom\_output/tcga.pl -p <PROJECT>
245245
```
246246
#### **Output**
247247

@@ -279,7 +279,7 @@ mirna\_species.txt files
279279
**Run expression\_matrix.pl**
280280

281281
```bash
282-
{dbhost or xhost}~> perl <BASEDIR>/v0.2.7/code/library\_stats/expression\_matrix.pl -m <mirbase> -o <mirbase\_species\_code> -p <PROJECT>
282+
{dbhost or xhost}~> perl <BASEDIR>/v0.2.8/code/library\_stats/expression\_matrix.pl -m <mirbase> -o <mirbase\_species\_code> -p <PROJECT>
283283
```
284284

285285
-m is the miRBase database to use, e.g. mirna\_20
@@ -304,7 +304,7 @@ crossmapped.txt files
304304
#### Run library\_stats/expression\_matrix\_mimat.pl
305305

306306
```bash
307-
{dbhost}~> perl <BASEDIR>/v0.2.7/code/library\_stats/tcga/expression\_matrix\_mimat.pl -m <mirbase> -o <mirbase\_species\_code> -p <PROJECT>
307+
{dbhost}~> perl <BASEDIR>/v0.2.8/code/library\_stats/tcga/expression\_matrix\_mimat.pl -m <mirbase> -o <mirbase\_species\_code> -p <PROJECT>
308308
```
309309
-m is the miRBase database to use, e.g. mirna\_20
310310
-o is the species code used by miRBase for the desired organism. For human, use hsa
@@ -321,7 +321,7 @@ Either rename the isoform.quantification.txt files you want included in the expr
321321
#### Generate the miRNA adf text file
322322

323323
```bash
324-
{dbhost or xhost}~> perl <BASEDIR>/v0.2.7/code/library_stats/tcga/create_adf.pl -m <mirbase> -o <mirbase_species_code> -g <genome_version> -v <mirbase_adf_file>
324+
{dbhost or xhost}~> perl <BASEDIR>/v0.2.8/code/library_stats/tcga/create_adf.pl -m <mirbase> -o <mirbase_species_code> -g <genome_version> -v <mirbase_adf_file>
325325
```
326326
-m is the miRBase database to use as listed in the db\_connections.cfg file, e.g. mirna\_20
327327
-o is the species code used by miRBase for the desired organism. For human, use hsa.
@@ -334,7 +334,7 @@ miRNA\_ID miRBase\_Ver Accession Genomic\_Coords Precursor\_Seq Mature\_Coords M
334334
#### Run library\_stats/tcga/expression\_matrix\_mimat.pl
335335

336336
```bash
337-
{dbhost or xhost}~> perl <BASEDIR>/v0.2.7/code/library_stats/tcga/expression_matrix_mimat.pl -m <miRNA_adf_file> -p <Level_3_archive_directory>
337+
{dbhost or xhost}~> perl <BASEDIR>/v0.2.8/code/library_stats/tcga/expression_matrix_mimat.pl -m <miRNA_adf_file> -p <Level_3_archive_directory>
338338
```
339339
-m miRNA\_ADF is the adf file generated by create\_adf.pl
340340
-p the directory where the \*.isoform.quantification.txt files can be found. The script will loop through all isoform.quantification.txt files that it finds in this directory to build the expression matrix.
@@ -351,7 +351,7 @@ The alignment\_stats.csv file and the report files located in each \_features su
351351
#### Run graph\_libs.pl
352352

353353
```bash
354-
{xhost}~> perl <BASEDIR>/v0.2.7/code/library_stats/graph_libs.pl -p <PROJECT>
354+
{xhost}~> perl <BASEDIR>/v0.2.8/code/library_stats/graph_libs.pl -p <PROJECT>
355355
```
356356
Make sure that if running this script through an ssh tunnel, X11 forwarding is enabled so that the R graphing libraries can be used. e.g. ~> ssh -X &lt;MYSERVER&gt;
357357

@@ -366,16 +366,16 @@ The miRNA annotation pipeline requires the following scripts and software
366366
- **adapter trimming script**: http://www.bcgsc.ca/platform/bioinfo/software/adapter-trimming-for-small-rna-sequencing
367367
- **BWA 0.5.7**
368368
- **samtools 0.1.7**
369-
- **miRNA profiling 0.2.7**
369+
- **miRNA profiling 0.2.8**
370370

371371
#### **Description of online documentation files**
372372

373-
- *Pre-alignment processing:* A general description of how adapter trimming is performed is provided in the "1a. Preprocessing" section of the pipeline description at: <BASEDIR>/v0.2.7/DESCRIPTION.txt. Additional description and usage information is available through the publicly available archive http://www.bcgsc.ca/platform/bioinfo/software/adapter-trimming-for-small-rna-sequencing
374-
- *Annotation:* Overview of miRNA analysis at the BCCA-GSC: &lt;BASEDIR&gt;/v0.2.7/DESCRIPTION.txt. Script usage detailed: &lt;BASEDIR&gt;/v0.2.7/code/annotation/HOWTO.txt
375-
- *Alignment stats:* &lt;BASEDIR&gt;/v0.2.7/library\_stats/README.txt
376-
- *Expression:* TCGA formatted miRNA expression reports: &lt;BASEDIR&gt;/v0.2.7/custom\_output/tcga/README.txt File formats for TCGA data files are also available online at: <https://wiki.nci.nih.gov/display/TCGA/miRNASeq
377-
- *miRNA expression maxtrices:* &lt;BASEDIR&gt;/v0.2.7/library\_stats/README.txt
378-
- *Optional graphs:* &lt;BASEDIR&gt;/v0.2.7/library\_stats/README.txt
373+
- *Pre-alignment processing:* A general description of how adapter trimming is performed is provided in the "1a. Preprocessing" section of the pipeline description at: <BASEDIR>/v0.2.8/DESCRIPTION.txt. Additional description and usage information is available through the publicly available archive http://www.bcgsc.ca/platform/bioinfo/software/adapter-trimming-for-small-rna-sequencing
374+
- *Annotation:* Overview of miRNA analysis at the BCCA-GSC: &lt;BASEDIR&gt;/v0.2.8/DESCRIPTION.txt. Script usage detailed: &lt;BASEDIR&gt;/v0.2.8/code/annotation/HOWTO.txt
375+
- *Alignment stats:* &lt;BASEDIR&gt;/v0.2.8/library\_stats/README.txt
376+
- *Expression:* TCGA formatted miRNA expression reports: &lt;BASEDIR&gt;/v0.2.8/custom\_output/tcga/README.txt File formats for TCGA data files are also available online at: <https://wiki.nci.nih.gov/display/TCGA/miRNASeq
377+
- *miRNA expression maxtrices:* &lt;BASEDIR&gt;/v0.2.8/library\_stats/README.txt
378+
- *Optional graphs:* &lt;BASEDIR&gt;/v0.2.8/library\_stats/README.txt
379379

380380
***Table 1.***
381381
Annotation priorities used to resolve multiple matches, from highest to lowest priority. If a read has more than 1 alignment, and the annotations are different, the priorities from Table 1 are used as long as only 1 alignment is to a miRNA. If there are multiple alignments to different miRNAs (or even different regions of the same miRNA), the read is flagged as cross-mapped and every miRNA annotation is preserved. Using these rules, the reads are summed by annotation, and coverage reports are generated.*
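
The resolution rule described in the Table 1 caption can be sketched as a small shell function. This is an illustration only, not the pipeline's implementation: the function name `resolve_read`, the `class:name` annotation format, and the priority order in `PRIORITY` are all assumptions made for the example (the real order is the one given in Table 1), and annotations are assumed to contain no spaces.

```shell
# Hypothetical priority order, highest first. Placeholder values only:
# the actual order is defined in Table 1 of the README.
PRIORITY="miRNA snoRNA tRNA rRNA"

# Usage: resolve_read ANNOTATION...
# Each argument is one alignment's annotation, written as class:name
# (for example miRNA:hsa-mir-21:mature). This format is an assumption.
resolve_read() (
    mirnas=""
    n_mirna=0
    for ann in "$@"; do
        case "$ann" in
            miRNA:*)
                # Count each distinct miRNA annotation once; different
                # regions of the same miRNA count as distinct.
                case " $mirnas " in
                    *" $ann "*) ;;
                    *) mirnas="${mirnas:+$mirnas }$ann"
                       n_mirna=$((n_mirna + 1)) ;;
                esac ;;
        esac
    done
    # More than one distinct miRNA annotation: flag the read as
    # cross-mapped and keep every miRNA annotation.
    if [ "$n_mirna" -gt 1 ]; then
        echo "crossmapped $mirnas"
        return 0
    fi
    # Otherwise report the annotation whose class ranks highest.
    for class in $PRIORITY; do
        for ann in "$@"; do
            case "$ann" in
                "$class":*) echo "$ann"; return 0 ;;
            esac
        done
    done
    return 1
)
```

Under these assumptions, `resolve_read snoRNA:U3 miRNA:hsa-mir-21:mature` reports the miRNA annotation, while two alignments to different regions of the same miRNA are reported as cross-mapped with both annotations kept.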
File renamed without changes.
Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
miRNA Analysis Pipeline v0.2.7
1+
miRNA Analysis Pipeline v0.2.8
22

33
The TCGA miRNAseq data generation process, including strand-specific library construction, sequencing, and computational processing is described in:
44
Chu A, Robertson G, Brooks D, Mungall AJ, Birol I, Coope R, Ma Y, Jones S, Marra MA. Large-scale profiling of microRNAs for The Cancer Genome Atlas. Nucleic Acids Res. 2015 Aug 13.
File renamed without changes.