GATK Pipeline Validation Testing #470

Open
alex-hancock opened this issue Oct 11, 2016 · 3 comments
@alex-hancock
Member

@fnothaft has concerns (upon which he will elaborate) regarding the testing process.

@jpfeil
Contributor

jpfeil commented Oct 17, 2016

The current plan is to run the Ashkenazim trio (ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/) on an AWS cluster. The trio consists of three 2x250 Illumina samples, and GIAB also provides high-confidence variant calls for each sample for benchmarking. We plan to genotype each sample individually and filter variants using hard filters. The goal is to test a configuration similar to the one the ADAM/GATK comparison will use. It was brought up that testing three samples may not be sufficient. We can find more samples, but each additional sample adds to the cost of the run. How many samples are you planning to run for the ADAM/GATK comparison?
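
For illustration only, the benchmarking step could look roughly like the sketch below: compare a sample's filtered calls against its GIAB high-confidence calls and report precision and recall. This is a simplified assumption of how the comparison might work, not the pipeline's actual code. The names `parse_vcf_lines` and `benchmark` are hypothetical, variants are matched on exact (chrom, pos, ref, alt) keys, and genotypes, high-confidence regions, and differences in variant representation are ignored, all of which a real comparison would need to handle.

```python
from typing import Iterable, NamedTuple, Set, Tuple

# A variant is keyed by (chrom, pos, ref, alt). This is a simplification:
# genotype and variant normalization are not considered here.
Variant = Tuple[str, int, str, str]


def parse_vcf_lines(lines: Iterable[str]) -> Set[Variant]:
    """Collect one variant key per ALT allele from uncompressed VCF text lines."""
    variants: Set[Variant] = set()
    for line in lines:
        # Skip blank lines and both "##" meta lines and the "#CHROM" header.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            if alt != ".":  # "." means no alternate allele was called
                variants.add((chrom, pos, ref, alt))
    return variants


class Benchmark(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    precision: float
    recall: float


def benchmark(called: Set[Variant], truth: Set[Variant]) -> Benchmark:
    """Score called variants against a truth set by exact key match."""
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but not called
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return Benchmark(tp, fp, fn, precision, recall)
```

Running `benchmark(parse_vcf_lines(open(called_path)), parse_vcf_lines(open(truth_path)))` once per sample would give one precision/recall pair per trio member, where `called_path` and `truth_path` are placeholder paths to uncompressed VCF files.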

@fnothaft
Contributor

10 for the head-to-head, 260 for ADAM only. I would probably go with a larger dataset than a trio; the Illumina Platinum Pedigree (http://www.illumina.com/platinumgenomes/) is something I would run through. Essentially, I'd like to push at least 1 TB of data through the pipeline.

@jpfeil
Contributor

jpfeil commented Oct 17, 2016

Okay! That sounds good. Alex and I will kick off a run with the Platinum Pedigree samples ASAP.
