Validations

Darren J. Lin edited this page Mar 3, 2021 · 1 revision

General introduction

In this study, we performed extensive experimental and computational validation, as well as manual inspection, of the Mako CSVs called in HG00733 (609 events). Experimental validation was done by PCR and Sanger sequencing. Computational validation consisted of ONT-read-based VaPoR (https://github.com/mills-lab/vapor) validation and HiFi-contig-based k-mer validation. The ONT reads and the latest HiFi contigs were obtained from the Human Genome Structural Variant Consortium (HGSVC).

Data source

HG00733 haploid HiFi assembly

http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/HGSVC2/working/20200628_HHU_assembly-results_CCS_v12/assemblies/phased/

HG00733 Oxford Nanopore sequencing

http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/hgsv_sv_discovery/working/20181210_ONT_rebasecalled/

VaPoR validation

Please refer to the VaPoR (https://github.com/mills-lab/vapor) GitHub repo.

K-mer validation workflow

[Figure: k-mer validation workflow]
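
The workflow itself is only given as a figure, so the following is a minimal sketch of the general idea behind k-mer based validation against a contig, not the procedure used in this study. It assumes that a predicted junction sequence is available for each event and measures which fraction of its k-mers occurs in a HiFi contig on either strand. The function names, the default k = 21, and the notion of a "junction sequence" are all assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of all k-mers in seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def junction_kmer_support(junction_seq, contig_seq, k=21):
    """Fraction of junction k-mers found in the contig on either strand.

    Hypothetical helper: junction_seq is the sequence predicted across a
    breakpoint junction, contig_seq is an assembled contig.
    """
    query = kmers(junction_seq, k)
    if not query:
        # Junction shorter than k: no k-mers to test.
        return 0.0
    target = kmers(contig_seq, k) | kmers(revcomp(contig_seq), k)
    return len(query & target) / len(query)


if __name__ == "__main__":
    contig = "ACGTTGCAAGGCTTAACCGGTT"   # toy contig, not real data
    junction = contig[5:17]             # a junction fully present in the contig
    print(junction_kmer_support(junction, contig, k=5))
```

A support value near 1 would suggest that the contig carries the predicted junction, while a value near 0 would suggest that it does not. Any threshold for calling an event valid would have to be chosen separately.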

Results

Experimental validation

Chrom Start End PacBio Type
chr1 81,194,398 81,195,874 invDup
chr2 119,659,504 119,661,322 insDup
chr3 146,667,093 146,677,284 delDisDup
chr5 141,480,327 141,483,116 disDup
chr7 1,940,931 1,941,009 insDup
chr9 29,591,409 29,593,057 delINV
chr10 14,568,488 14,568,677 insDup
chr12 71,315,482 71,316,928 invDup
chr12 77,989,900 77,994,324 invDup
chr13 74,340,759 74,342,810 disDup
chr16 78,004,459 78,007,456 disDup
chr17 34,854,438 34,855,851 invDup
chr17 48,538,270 48,540,171 disDup
chr18 72,044,575 72,045,937 disDup
chr21 26,001,844 26,002,990 delINV

All validation results

Strategy Valid
PCR and Sanger sequencing 68%
ONT reads 42%
HiFi contig 68%
ONT reads or HiFi contig 87%
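
The combined "ONT reads or HiFi contig" rate is the fraction of events validated by at least one of the two computational strategies. The sketch below shows how such rates could be computed from sets of validated event IDs, and how inclusion-exclusion gives the overlap implied by the reported figures. It assumes that the ONT and HiFi percentages are computed over the same set of events, which the table does not state. The function names and the toy event IDs are hypothetical.

```python
def validation_rates(ont_valid, hifi_valid, n_events):
    """Per-strategy and combined validation rates.

    ont_valid and hifi_valid are sets of event IDs validated by each
    strategy; n_events is the total number of events assessed.
    """
    return {
        "ONT reads": len(ont_valid) / n_events,
        "HiFi contig": len(hifi_valid) / n_events,
        "ONT reads or HiFi contig": len(ont_valid | hifi_valid) / n_events,
    }


def implied_overlap(rate_a, rate_b, rate_union):
    """Inclusion-exclusion: P(A and B) = P(A) + P(B) - P(A or B)."""
    return rate_a + rate_b - rate_union


if __name__ == "__main__":
    # Toy event IDs for illustration only, not the study's data.
    print(validation_rates({1, 2, 3}, {3, 4, 5, 6}, n_events=10))

    # Overlap implied by the reported rates, under the shared-denominator assumption.
    print(round(implied_overlap(0.42, 0.68, 0.87), 2))
```

Under that assumption, roughly 23% of events would be supported by both ONT reads and HiFi contigs.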