Step 01 ‐ ancestry
gibran hemani edited this page Jan 14, 2025 · 8 revisions
This step will generate principal components if required. Notes:
- Ensure that env_family_data="true" is set in the config.env file if the data has substantial relatedness. This will ensure that the relatedness is handled appropriately. If it is set to "false", this step will identify a subset of approximately unrelated individuals for the main analyses.
- The number of PCs used by default is 10; set the appropriate number for your study in the config.env file.
- The pipeline expects the analysis to be run on a single ancestral group. If your cohort has substantial numbers of samples across multiple ancestral groups, please run the pipeline separately for each of those major ancestral groups. Some cohorts may be composed largely of admixed individuals. In general we suggest treating such cohorts as a single group, but we would defer to your experience of how best to handle your cohort.
- If you have pre-calculated PCs for the samples in these data you are welcome to provide them. Please store them in $genotype_processed_dir/pcs.txt, with columns FID, IID, PC1, PC2, ...
- This ran in ~2 hours on 450k samples in UK Biobank using env_threads=100 and up to 32 GB RAM.
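If you supply your own PCs, a quick layout check before running the step may save a failed run. The sketch below is not part of the pipeline: the function name check_pcs_file is made up for illustration, and it assumes that pcs.txt is whitespace-separated and starts with a header row naming the columns FID, IID, PC1, PC2, ... (neither detail is stated on this page).

```shell
# Hypothetical helper, not part of the pipeline.
# Assumptions: whitespace-separated fields, and a header row FID IID PC1 PC2 ...
check_pcs_file() {
    awk '
        NR == 1 {
            # The header must start with FID and IID and contain at least one PC.
            if (NF < 3 || $1 != "FID" || $2 != "IID") {
                print "bad header: expected FID IID PC1 ..."
                exit 1
            }
            # The remaining columns must be PC1, PC2, ... in order.
            for (i = 3; i <= NF; i++) {
                if ($i != ("PC" (i - 2))) {
                    print "bad header: column " i " is " $i ", expected PC" (i - 2)
                    exit 1
                }
            }
            ncol = NF
            next
        }
        # Every data row must have as many fields as the header.
        NF != ncol {
            print "line " NR ": expected " ncol " fields, found " NF
            exit 1
        }
    ' "$1"
}
```

It could be called as check_pcs_file "$genotype_processed_dir/pcs.txt"; a non-zero exit status means the layout differs from what this sketch assumes.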
To run:
./01-ancestry.sh
If necessary, the script can be resumed at different steps, e.g. related, pcs, grm, keeplists. For example,
./01-ancestry.sh pcs
will run from the pcs step to the end of the file.
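The "run from a named step to the end" behaviour can be pictured with a small sketch. This is not the code of 01-ancestry.sh: the function name steps_from is hypothetical, and the step list is assumed to be exactly related, pcs, grm, keeplists in that order, which this page only gives as examples.

```shell
# Hypothetical illustration of resuming from a named step.
# Assumption: these are the steps, in execution order.
STEPS="related pcs grm keeplists"

# Print the steps that would run, one per line, starting at the requested step.
# With no argument, start from the first step.
steps_from() {
    start="${1:-related}"
    started=0
    for step in $STEPS; do
        if [ "$step" = "$start" ]; then started=1; fi
        if [ "$started" -eq 1 ]; then echo "$step"; fi
    done
    # Return non-zero if the requested step name was not recognised.
    [ "$started" -eq 1 ]
}
```

Under these assumptions, steps_from pcs prints pcs, grm and keeplists, which mirrors what resuming at pcs is described as doing.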
Please manually inspect the $results_dir/pcaplot.png file to check that ancestral clustering looks as expected before uploading the results.
To generate the tarball and md5 sums for the results of this step, and then upload them, run:
./utils/archive.sh 01
./utils/upload.sh 01
(Note that the upload will prompt you for your SFTP password. Contact us if you haven't received it.)
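If you want to confirm that a tarball still matches its recorded md5 sum before or after transfer, a comparison along these lines may help. This is a sketch only: md5_matches is a hypothetical helper, it relies on the GNU coreutils md5sum command being available, and the names and locations of the files written by archive.sh are not given on this page, so they are left as arguments.

```shell
# Hypothetical helper, not part of the pipeline.
# Usage: md5_matches <file> <expected_md5_hex_digest>
# Assumption: the md5sum command (GNU coreutils) is installed.
md5_matches() {
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}
```

The function returns zero when the digests agree and non-zero otherwise, so it can be used directly in an if statement.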
To run this step with Docker, make sure you've pulled the latest image:
docker pull mrcieu/lifecourse-gwas:latest
and then run:
./utils/run_docker.sh ./01-ancestry.sh
To run this step with Apptainer, make sure you've pulled the latest image:
apptainer pull docker://mrcieu/lifecourse-gwas:latest
and then run:
./utils/run_apptainer.sh ./01-ancestry.sh