The current plan is to run the Ashkenazim trio (ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/) on an AWS cluster. The trio consists of three 2x250 Illumina samples, and GIAB provides high-confidence variant calls for each sample for benchmarking. We plan to genotype each sample individually and filter variants using hard filters. The goal is to test a configuration similar to the one the ADAM/GATK comparison will use. It was brought up that three samples may not be sufficient. We can find more samples to run, but each additional sample adds to the cost. How many samples are you planning to run for the ADAM/GATK comparison?
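For concreteness, the hard-filtering step could look something like the sketch below. It is only an illustration: the thresholds are the generic SNP hard-filter values commonly cited in the GATK documentation, not values agreed on in this thread, and `parse_info` and `failed_filters` are hypothetical helper names rather than part of any existing tool.

```python
# Assumed thresholds: the generic GATK hard-filter values for SNPs.
# Each entry maps an INFO annotation to a predicate that is True when
# the site FAILS that filter.
SNP_FILTERS = {
    "QD": lambda v: v < 2.0,
    "FS": lambda v: v > 60.0,
    "MQ": lambda v: v < 40.0,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}


def parse_info(info):
    """Parse a VCF INFO string into a dict of numeric annotations.

    Flags (no '=') and non-numeric or multi-valued fields are skipped.
    """
    annotations = {}
    for field in info.split(";"):
        if "=" not in field:
            continue
        key, value = field.split("=", 1)
        try:
            annotations[key] = float(value)
        except ValueError:
            continue
    return annotations


def failed_filters(info):
    """Return the names of the hard filters that a site fails.

    Annotations missing from the INFO string are not evaluated.
    """
    annotations = parse_info(info)
    return [
        name
        for name, fails in SNP_FILTERS.items()
        if name in annotations and fails(annotations[name])
    ]
```

A site with an empty result passes; anything else would be marked as filtered before comparing against the GIAB high-confidence calls.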
10 for the head-to-head, 260 for ADAM only. I would probably go with a larger dataset than a trio; the Illumina Platinum Pedigree (http://www.illumina.com/platinumgenomes/) is one I would run through. Essentially, I'd like to push at least 1 TB of data through the pipeline.
@fnothaft has concerns (upon which he will elaborate) regarding the testing process.