RNASeq023 - HSCC after chronic 1-month EtOH
The analysis-related files are distributed among 3 directories:
data: input data and intermediate files
data/Alignment_mm10 --> bam and bai files
data/Normalization --> all input files needed to generate normalized read counts
scripts: all scripts used for analysis
analysis: final analysis results files
Alignment:
Because the sample file numbers had some breaks in sequence, I used 3 scripts to run the alignment simultaneously on different batches of samples.
In lawrence:
nohup ./run_star_alignment_1.sh > ./run_star_alignment_1.log 2>&1 </dev/null &
nohup ./run_star_alignment_1b.sh > ./run_star_alignment_1b.log 2>&1 </dev/null &
nohup ./sort_and_index_1b.sh > ./sort_and_index_1b.log 2>&1 </dev/null &
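The three nohup commands above follow one pattern: run a script detached from the terminal, send stdout and stderr to a log named after the script, and read stdin from /dev/null. A minimal sketch of that pattern as a reusable function follows. The name launch_batch is hypothetical and is not one of the scripts in this repository.

```shell
#!/bin/sh
# launch_batch is a hypothetical helper, not part of the original scripts.
# It starts one batch script in the background, detached from the terminal,
# and writes stdout and stderr to a log named after the script
# (./run_star_alignment_1.sh -> ./run_star_alignment_1.log).
launch_batch() {
    script="$1"
    log="${script%.sh}.log"
    nohup "$script" > "$log" 2>&1 </dev/null &
}

# Example usage (commented out so that sourcing this file starts nothing):
#   launch_batch ./run_star_alignment_1.sh
#   launch_batch ./run_star_alignment_1b.sh
#   wait    # optional: block until every background batch has finished
```

Each batch gets its own log, so a failure in one batch can be traced without touching the others.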
In exacloud:
condor_submit align.sub
align.sub:
executable = run_star_alignment_2.sh
log = submit_run_star_alignment.log
output = submit_run_star_alignment.out
error = submit_run_star_alignment.err
request_cpus = 20
request_memory = 64 GB
notification = Complete
notify_user = [email protected]
queue
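If more batches need to go to the cluster, a submit description like the one above can be generated per script rather than edited by hand. The sketch below is only an illustration: write_submit_file is a hypothetical helper, it derives the log, output, and error names from the executable name (so they differ slightly from the names listed above), and it leaves out the notification settings.

```shell
#!/bin/sh
# write_submit_file is a hypothetical helper, not part of the original scripts.
# Usage: write_submit_file <executable.sh> <submit_file>
# The resource requests are copied from the align.sub shown in this README.
write_submit_file() {
    exe="$1"
    base="${exe%.sh}"
    cat > "$2" <<EOF
executable = $exe
log = submit_$base.log
output = submit_$base.out
error = submit_$base.err
request_cpus = 20
request_memory = 64 GB
queue
EOF
}

# Example usage (commented out; condor_submit needs a working HTCondor setup):
#   write_submit_file run_star_alignment_2.sh align.sub
#   condor_submit align.sub
```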
Counts:
nohup ../../scripts/run_bedtools_coverage.sh > ./run_bedtools_coverage.log 2>&1 </dev/null &
mv RNASeq023_mm10_coverage_splitoption.txt ../../analysis/
ls -d Sample_RNA150217RH_* > samples1.txt
cd ../150423_D00735_0035_BC7D48ACXX/
ls -d Sample_RNA150217RH_* >> ../150423_D00735_0034_AC791EACXX/samples1.txt
cd ../150423_D00735_0034_AC791EACXX/
#### After removing unnecessary substrings from sample names and tab separating core id from lab id:
mv samples1.txt sample_key.txt
mv sample_key.txt ../data/
cd ../data/
sort -n sample_key.txt > sample_key_sorted
mv sample_key_sorted sample_key_sorted.txt
cd Alignment_mm10/
ls Alignment_mm10/*sorted*bam > bam_files_counts_header_order.txt
#### Removed unnecessary substrings from sample names and tab separating core id from lab id
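This README does not record how the sample names were cleaned up before sample_key.txt was sorted. One possible way to script that step is sketched below. It assumes, purely for illustration, that each directory name looks like Sample_RNA150217RH_<core id>_<lab id>; the real names may differ, and make_sample_key is a hypothetical helper.

```shell
#!/bin/sh
# make_sample_key is a hypothetical helper, not part of the original scripts.
# ASSUMPTION: input names look like Sample_RNA150217RH_<core id>_<lab id>.
# It reads names on stdin, strips the common prefix, replaces the first
# remaining underscore with a tab, and sorts numerically by core id.
make_sample_key() {
    awk '{ sub(/^Sample_RNA150217RH_/, ""); sub(/_/, "\t"); print }' | sort -n
}

# Example usage (commented out; run it in a directory holding the samples):
#   ls -d Sample_RNA150217RH_* | make_sample_key > sample_key_sorted.txt
```

Sorting with -n keeps core id 2 ahead of core id 10, which a plain lexical sort would not.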
Normalization:
used --> scripts/selectNormalizeGeneExonCounts_RNASeq023.R