Building reference panel
In general, the weights have to be generated. The TWAS software contains two command files:
- TWAS_get_weights.sh, which obtains weights (.ld, .cor, .map) from a PLINK map/ped pair for a particular locus. It is a wrapper around an R program.
- TWAS.sh, which conducts imputation as reported in Gusev et al. (2016).
Minor changes to the scripts may be required for your own data. The tasks involved are to
- extract SNPs in a gene from 1000Genomes imputed data into PLINK map/ped files
- obtain .ld, .cor and .map with TWAS_get_weights.sh for that gene
- select summary statistics (.zscore) for the gene
- conduct imputation with TWAS.sh into file .imp
- repeat the above steps for all genes and collect results
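The per-gene loop in the last task can be sketched as a small driver. This is only an illustrative skeleton: the function run_all_genes, the placeholder helper, and the step names are hypothetical and not part of the TWAS software. In practice each step would wrap your own call to PLINK, TWAS_get_weights.sh, or TWAS.sh, whose arguments are not shown here.

```python
def run_all_genes(genes, steps):
    """Run every step in order for each gene; keep results and failures apart."""
    results, failures = {}, {}
    for gene in genes:
        state = {"gene": gene}
        try:
            for step in steps:
                state = step(state)
        except Exception as err:
            # Record the failure and carry on with the next gene.
            failures[gene] = f"{type(err).__name__}: {err}"
        else:
            results[gene] = state
    return results, failures


def placeholder(name):
    """Hypothetical stand-in for a real step (extract, weights, select, impute)."""
    def step(state):
        return {**state, name: "ok"}
    return step


steps = [placeholder(n) for n in ("extract", "weights", "select", "impute")]
results, failures = run_all_genes(["GENE1", "GENE2"], steps)
print(results)
print(failures)
```

Recording failures per gene rather than stopping at the first error means a single problematic locus does not abort a genome-wide run.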
Gene boundaries can be obtained from UCSC as follows,
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e 'select * from refGene' > refGene.txt
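As a minimal sketch of how refGene.txt might then be reduced to one boundary per gene, the Python below takes the smallest txStart and largest txEnd over all transcripts sharing a gene symbol (name2) and chromosome. It assumes the file is tab-delimited with a header row of refGene column names, as the mysql client normally writes in batch mode. The function name gene_boundaries and the demo rows are made up for illustration, and the demo carries only a subset of the real columns.

```python
import csv
import io


def gene_boundaries(handle):
    """Collapse transcripts to one (txStart, txEnd) span per (gene symbol, chrom)."""
    spans = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        key = (row["name2"], row["chrom"])
        start, end = int(row["txStart"]), int(row["txEnd"])
        if key in spans:
            old_start, old_end = spans[key]
            spans[key] = (min(old_start, start), max(old_end, end))
        else:
            spans[key] = (start, end)
    return spans


# Two made-up transcripts of one hypothetical gene, for illustration only.
demo = io.StringIO(
    "name\tchrom\tstrand\ttxStart\ttxEnd\tname2\n"
    "NM_A\tchr1\t+\t100\t500\tGENE1\n"
    "NM_B\tchr1\t+\t150\t650\tGENE1\n"
)
boundaries = gene_boundaries(demo)
print(boundaries)
```

For the real file, pass an open handle, e.g. open("refGene.txt", newline=""). Note that UCSC txStart values are 0-based while txEnd values are 1-based.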
However, it is often necessary to define a region using a list of SNPs. In this regard, tables such as snp146 in hg19 above are needed. From locuszoom-1.3 (Pruim et al. 2010) we can extract refFlat.txt and snp_pos.txt (see lz.sql) to build a list of SNP-gene pairs, as with the UK Biobank Axiom chip annotation Axiom_UKB_WCSG.na34.annot.csv.zip. Their chromosome-specific counterparts, as with SNPs under all genes, can also be derived. A Stata program, lz.do, which calls refGene.do, was developed in collaboration with Dr Jian'an Luan to facilitate handling of gene boundaries.
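The SNP-gene pairing described above can be illustrated with a short Python sketch. It is not a translation of lz.do or lz.sql. It assumes gene spans are given as (gene, chrom, txStart, txEnd) with a 0-based start, and SNPs as (snp, chrom, pos) with a 1-based position. The function name snp_gene_pairs, the flank parameter, and the demo records are hypothetical.

```python
from collections import defaultdict


def snp_gene_pairs(genes, snps, flank=0):
    """Return sorted (snp, gene) pairs for SNPs inside a gene span +/- flank."""
    by_chrom = defaultdict(list)
    for gene, chrom, start, end in genes:
        by_chrom[chrom].append((start - flank, end + flank, gene))
    pairs = []
    for snp, chrom, pos in snps:
        for start, end, gene in by_chrom.get(chrom, []):
            # start is 0-based and end is 1-based, hence the half-open test.
            if start < pos <= end:
                pairs.append((snp, gene))
    return sorted(pairs)


# Made-up records for illustration only.
genes = [("GENE1", "1", 100, 200), ("GENE2", "1", 150, 300), ("GENE3", "2", 100, 200)]
snps = [("rs1", "1", 120), ("rs2", "1", 180), ("rs3", "1", 400), ("rs4", "2", 100)]
pairs = snp_gene_pairs(genes, snps)
print(pairs)
```

The nested loop is kept for clarity. For genome-wide SNP tables, a sorted interval search per chromosome would scale better.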
Reference
Pruim RJ, et al. (2010). LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics, 26, 2336-2337.