-
Notifications
You must be signed in to change notification settings - Fork 739
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hap.py benchmarking #896
Comments
Great question, always happy to help! Setting Is it possible for you to share the BAM and VCF/truth set used? One possibility is that the truth set doesn't match HG003. For example if you run happy for HG003 using HG002 truth set the accuracy would be around 0.4. |
That particular region in HLA is likely to have multiple different alleles mapped to it due to the diversity of structural variation in the DRB loci. Hg38 (no_alts) only has DRB5 in chr6. If your sample has DRB1 alleles that are linked to DRB3/4, these reads often map here. (see fig 4) https://pmc.ncbi.nlm.nih.gov/articles/PMC10849470/ |
Hi Lucas and John, Thank you so much for your input. I’m currently trying to figure out what’s happening… with different references..! Kind regards, |
Hey, sorry to reopen the issue. I called variants against HLA class II genes region for HG003 using HG003_GRCh38_1_22_v4.2.1_benchmark_noinconsistent.bed and have very low precision values :( probably I am doing wrong variant calling in the HLA... sorry for dumb questions, I am trying to figure this out on my own (and with AI) |
You need to provide the HLA bed with -T option in hap.py cause it seems like it's comparing against everything. Can you paste your hap.py command here? |
Hello,
Sorry for writing again about some questions in the analysis, I don't have any expertise on DeepVariant/variant benchmarking around me :(
I wanted to benchmark identified variants using hap.py as you suggest in the tutorial, and the output surprised me a bit. For the variants I called the recall/precision were from 0 to 0.4, which is very low... I did not change default pacbio parameters when running DeepVariant, except for asking to keep supplementary alignments. Maybe it is partially because of the complexity of HLA region, but I am not sure what is the reason. What do you think, is there something obviously wrong?
I am using version 1.6.1.
Kind regards,
Alisa
The text was updated successfully, but these errors were encountered: