Step 02 ‐ phenotype organisation
This step organises the phenotype and covariate data ready for the GWAS analysis. It creates summaries and plots for each phenotype so that we can evaluate the distributional overlaps across cohorts. Efforts have been made to ensure that no individual-level data will be shared, and that no disclosive data will be included in the summary data that is shared (e.g. jitter is applied to plots).
Notes:
- Check the `static_covariates="sex yob"` flag in the `config.env` file. You may wish to add cohort-specific covariates to your covariates file, in which case they must also be declared here. The GWAS is performed in an age-stratified manner, so covariates that covary with age should not necessarily be a problem, but please discuss with us if you're unsure.
- See the Data-preparation and Phenotype-definitions pages for guidance on how the phenotype and covariate data should be prepared.
- This ran in ~20 minutes on 450k samples (two phenotypes) on UK Biobank with `env_threads=100`, using 10 GB RAM.
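The first note above can be checked before running the step. The following is a minimal sketch, assuming that `config.env` can be sourced as a shell fragment and that the covariates file is whitespace-delimited with a header row; the file names, the layout and the helper `missing_covariates` are hypothetical, not part of the pipeline.

```shell
#!/usr/bin/env bash
# Sketch only: the covariates file name and its whitespace-delimited layout
# with a header row are assumptions, not the pipeline's documented format.

# Print each name in a space-separated list that is absent from the file header.
missing_covariates() {
    local declared="$1" covfile="$2" header name
    header=$(head -n 1 "$covfile")
    for name in $declared; do
        # One header field per line, then match whole column names only
        if ! printf '%s\n' $header | grep -qx -- "$name"; then
            printf '%s\n' "$name"
        fi
    done
}

# Self-contained demo using throwaway files
demo_dir=$(mktemp -d)
printf 'static_covariates="sex yob"\n' > "$demo_dir/config.env"
printf 'FID IID sex yob\n1 1 1 1970\n' > "$demo_dir/covariates.txt"

# Load static_covariates from the demo config
. "$demo_dir/config.env"
missing=$(missing_covariates "$static_covariates" "$demo_dir/covariates.txt")
if [ -z "$missing" ]; then
    echo "all declared covariates found"
else
    echo "missing covariates: $missing"
fi
```

To use it on real data, point the two paths at your own `config.env` and covariates file, adjusting the header parsing if your file uses a different delimiter.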
./02-phenotype-organisation.sh
Please manually look at the `$results_dir/phenotype_organisation.html` page to check that each phenotype is distributed as you would expect. Also check that the samples being included and excluded are as you expect.
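Before opening the report, it may help to confirm that it was actually written. A minimal sketch, assuming only the report file name given above; the helper `report_ready` is hypothetical, and the demo uses a throwaway directory standing in for `$results_dir`.

```shell
#!/usr/bin/env bash
# Sketch only: report_ready is an illustrative helper, not part of the pipeline.

# Return 0 if the report exists and is non-empty, 1 otherwise
report_ready() {
    [ -s "$1/phenotype_organisation.html" ]
}

# Demo with a throwaway directory standing in for $results_dir
results_dir=$(mktemp -d)
if report_ready "$results_dir"; then
    echo "report found"
else
    echo "report missing or empty"
fi

# Simulate the step having written the report, then check again
echo "<html></html>" > "$results_dir/phenotype_organisation.html"
if report_ready "$results_dir"; then
    echo "report found"
else
    echo "report missing or empty"
fi
```

This only confirms that the file is present; the distributions and sample counts still need to be inspected by eye.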
The following will generate the tarball and md5 sums for the results of this step, and then upload them:
./utils/archive.sh 02
./utils/upload.sh 02
(Note that this will prompt you for your SFTP password. Contact us if you haven't received this.)
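For readers unfamiliar with md5 sums, the sketch below shows the general idea of recording a checksum for a tarball and verifying it later. It does not reproduce what `./utils/archive.sh` does internally; the file names are hypothetical, and GNU `md5sum` is assumed to be available.

```shell
#!/usr/bin/env bash
# Sketch only: file names are hypothetical; ./utils/archive.sh decides the real ones.
work=$(mktemp -d)
mkdir "$work/results"
echo "example summary" > "$work/results/summary.txt"

# Create a tarball and record its md5 sum next to it
tar -czf "$work/results_02.tar.gz" -C "$work" results
( cd "$work" && md5sum results_02.tar.gz > results_02.tar.gz.md5 )

# Verify the tarball against the recorded checksum
if ( cd "$work" && md5sum -c results_02.tar.gz.md5 >/dev/null 2>&1 ); then
    verify_status="ok"
else
    verify_status="mismatch"
fi
echo "checksum: $verify_status"
```

A mismatch after transfer would suggest the tarball was truncated or altered, in which case re-running the archive step is the safest option.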
If using Docker, make sure you've pulled the latest image:
docker pull mrcieu/lifecourse-gwas:latest
and then run:
./utils/run_docker.sh ./02-phenotype-organisation.sh
If using Apptainer, make sure you've pulled the latest image:
apptainer pull docker://mrcieu/lifecourse-gwas:latest
and then run:
./utils/run_apptainer.sh ./02-phenotype-organisation.sh
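If the same instructions are shared across machines with different container runtimes, the choice of wrapper can be made by checking what is installed. This is an illustrative sketch only: the helper `pick_wrapper` is hypothetical, and it prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Sketch only: choosing a wrapper automatically is an illustration, not part of the pipeline.

# Print the wrapper script for the first container runtime found on PATH
pick_wrapper() {
    if command -v docker >/dev/null 2>&1; then
        echo "./utils/run_docker.sh"
    elif command -v apptainer >/dev/null 2>&1; then
        echo "./utils/run_apptainer.sh"
    else
        echo "none"
    fi
}

wrapper=$(pick_wrapper)
if [ "$wrapper" = "none" ]; then
    echo "no container runtime found; run ./02-phenotype-organisation.sh directly"
else
    echo "would run: $wrapper ./02-phenotype-organisation.sh"
fi
```

Docker is checked first here purely by convention; swap the order if Apptainer is preferred on your system.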