Error while running test-analysis-exome.yml #589

Open
maojn opened this issue Jan 30, 2025 · 2 comments

maojn commented Jan 30, 2025

Hi,
Thank you for providing this useful software for the scientific community.
It has been installed on our HPC systems at NIH for years. I am in the process of updating to version 14.1.0 and ran into the error below.

I changed the genome assembly in test-analysis-exome.yml (in the examples directory) from hg19 to hg38 and changed nothing else.
$ ls -l /.../data/2406_hg38/*
-rwxrwxr-x Jun 13 2024 /fdb/exomiser/data/2406_hg38/2406_hg38_clinvar.mv.db*
-rwxrwxr-x Jun 10 2024 /fdb/exomiser/data/2406_hg38/2406_hg38_genome.mv.db*
-rwxrwxr-x Jun 10 2024 /fdb/exomiser/data/2406_hg38/2406_hg38_transcripts_ensembl.ser*
-rwxrwxr-x Jun 10 2024 /fdb/exomiser/data/2406_hg38/2406_hg38_transcripts_refseq.ser*
-rwxrwxr-x 74M Jun 10 2024 /fdb/exomiser/data/2406_hg38/2406_hg38_transcripts_ucsc.ser*
-rwxrwxr-x 30G Jun 10 2024 /fdb/exomiser/data/2406_hg38/2406_hg38_variants.mv.db*

$ ls -l /..../data/cadd/1.7/hg38/
total 83G
-rw-r--r-- 1 1.2G Jan 11 2024 gnomad.genomes.r4.0.indel.tsv.gz
-rw-r--r-- 1 1.9M Jan 11 2024 gnomad.genomes.r4.0.indel.tsv.gz.tbi
-rw-r--r-- 1 82G Jan 29 2024 whole_genome_SNVs.tsv.gz
-rw-r--r-- 1 2.7M Jan 29 2024 whole_genome_SNVs.tsv.gz.tbi

$ ls -l /..../data/remm/0.3.1.post1/
total 28G
-rw-r--r-- 1 2G Jan 29 11:34 ReMM.v0.3.1.post1.hg19.tsv.gz
-rw-r--r-- 1 2.4M Jan 29 11:34 ReMM.v0.3.1.post1.hg19.tsv.gz.tbi
-rw-r--r-- 1 17G Jan 29 11:42 ReMM.v0.3.1.post1.hg38.tsv.gz
-rw-r--r-- 1 2.7M Jan 29 11:42 ReMM.v0.3.1.post1.hg38.tsv.gz.tbi
-rw-r--r-- 1 264 Jan 29 11:42 ReMM.v0.3.1.post1.md5

$ cat applications.properties
exomiser.data-directory=/fdb/exomiser/data/
remm.version=0.3.1.post1
cadd.version=1.7
#### hg38 assembly
exomiser.hg38.data-version=2406
exomiser.hg38.cadd-snv-path=${exomiser.data-directory}/cadd/${cadd.version}/hg38/whole_genome_SNVs.tsv.gz
exomiser.hg38.cadd-in-del-path=${exomiser.data-directory}/cadd/${cadd.version}/hg38/gnomad.genomes.r4.0.indel.tsv.gz
exomiser.hg38.remm-path=${exomiser.data-directory}/remm/${remm.version}/ReMM.v${remm.version}.hg38.tsv.gz
exomiser.hg38.clinvar-data-version=2406
### phenotypes ###
exomiser.phenotype.data-version=2406
exomiser.phenotype.data-directory=${exomiser.data-directory}/${exomiser.phenotype.data-version}_phenotype
exomiser.phenotype.random-walk-file-name=rw_string_10.mv
### caching ###
spring.cache.type=caffeine
spring.cache.caffeine.spec=maximumSize=480000
### logging ###
logging.file-output
logging.level.com.zaxxer.hikari=ERROR
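
As a reading aid, here is a minimal Python sketch of how the `${...}` placeholders in the properties above expand into concrete paths. It is an illustration only and not Spring's actual resolver; `PROPS`, `PLACEHOLDER` and `resolve` are hypothetical names, and collapsing the duplicate slash with `posixpath.normpath` is an assumption chosen so the result is easy to compare against the paths reported in the startup log.

```python
import posixpath
import re

# Subset of the properties shown above, copied verbatim.
PROPS = {
    "exomiser.data-directory": "/fdb/exomiser/data/",
    "remm.version": "0.3.1.post1",
    "cadd.version": "1.7",
    "exomiser.hg38.cadd-snv-path":
        "${exomiser.data-directory}/cadd/${cadd.version}/hg38/whole_genome_SNVs.tsv.gz",
    "exomiser.hg38.remm-path":
        "${exomiser.data-directory}/remm/${remm.version}/ReMM.v${remm.version}.hg38.tsv.gz",
}

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(key, props=PROPS):
    """Expand ${name} placeholders recursively, then collapse duplicate slashes."""
    def expand(value, seen):
        def substitute(match):
            name = match.group(1)
            if name in seen:
                raise ValueError(f"circular placeholder: {name}")
            return expand(props[name], seen | {name})
        return PLACEHOLDER.sub(substitute, value)

    return posixpath.normpath(expand(props[key], frozenset({key})))


if __name__ == "__main__":
    for key in ("exomiser.hg38.cadd-snv-path", "exomiser.hg38.remm-path"):
        print(key, "->", resolve(key))
```

The two resolved values can be compared with the "Opening CADD snv data" and "Opening REMM data" lines in the log to confirm that the hg38 resources are the ones being picked up.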

Then I ran:
$ java -Xms2g -Xmx48g -jar exomiser-cli-14.1.0.jar --analysis examples/test-analysis-exome.yml

Welcome to:

[Exomiser ASCII-art banner]

A Tool to Annotate and Prioritize Exome Variants v14.1.0

2025-01-30T15:50:43.284-05:00 INFO 138646 --- [ main] o.monarchinitiative.exomiser.cli.Main : Starting Main using Java 17.0.3.1 with PID 138646 (/usr/local/apps/exomiser/14.1.0/exomiser-cli-14.1.0.jar started by maoj in /usr/local/apps/exomiser/14.1.0)
2025-01-30T15:50:43.287-05:00 INFO 138646 --- [ main] o.monarchinitiative.exomiser.cli.Main : No active profile set, falling back to 1 default profile: "default"
2025-01-30T15:50:43.769-05:00 INFO 138646 --- [ main] o.m.exomiser.cli.config.MainConfig : Exomiser home: /usr/local/apps/exomiser/14.1.0
2025-01-30T15:50:43.777-05:00 INFO 138646 --- [ main] o.m.exomiser.cli.config.MainConfig : Root data source directory set to: /fdb/exomiser/data
2025-01-30T15:50:43.779-05:00 INFO 138646 --- [ main] o.m.e.c.g.j.JannovarDataProtoSerialiser : Deserialising Jannovar data from /fdb/exomiser/data/2406_hg38/2406_hg38_transcripts_ensembl.ser
2025-01-30T15:50:44.179-05:00 INFO 138646 --- [ main] o.m.e.c.g.j.JannovarDataProtoSerialiser : Deserialisation took 0.398 sec.
2025-01-30T15:50:45.814-05:00 INFO 138646 --- [ main] o.m.e.c.g.dao.ClinVarWhiteListReader : Reading ClinVar whitelist...
2025-01-30T15:50:45.820-05:00 INFO 138646 --- [ main] o.m.e.c.g.dao.serialisers.MvStoreUtil : MVMap 'clinvar' opened with 2897469 entries
2025-01-30T15:50:47.796-05:00 INFO 138646 --- [ main] o.m.e.c.g.dao.ClinVarWhiteListReader : Read 238630 ClinVar whitelist variants in 1981 ms
2025-01-30T15:50:47.820-05:00 INFO 138646 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Loaded 238630 whitelist variants
2025-01-30T15:50:47.870-05:00 INFO 138646 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Opening CADD snv data from source: /fdb/exomiser/data/cadd/1.7/hg38/whole_genome_SNVs.tsv.gz
2025-01-30T15:50:47.927-05:00 INFO 138646 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Opening CADD InDel data from source: /fdb/exomiser/data/cadd/1.7/hg38/gnomad.genomes.r4.0.indel.tsv.gz
2025-01-30T15:50:47.956-05:00 INFO 138646 --- [ main] o.m.e.a.genome.GenomeDataSourceLoader : Opening REMM data from source: /fdb/exomiser/data/remm/0.3.1.post1/ReMM.v0.3.1.post1.hg38.tsv.gz
2025-01-30T15:50:48.076-05:00 INFO 138646 --- [ main] o.m.e.c.g.dao.serialisers.MvStoreUtil : MVMap 'alleles' opened with 816467174 entries
2025-01-30T15:50:48.092-05:00 INFO 138646 --- [ main] o.m.e.c.g.dao.serialisers.MvStoreUtil : MVMap 'clinvar' opened with 2897469 entries
2025-01-30T15:50:49.595-05:00 INFO 138646 --- [ main] o.m.e.core.genome.dao.ClinVarDaoMvStore : Created 18551 ClinVar gene stats in 1502 ms
2025-01-30T15:50:49.932-05:00 INFO 138646 --- [ main] g.GenomeAnalysisServiceAutoConfiguration : Configured hg38 genome analysis service
2025-01-30T15:50:53.242-05:00 INFO 138646 --- [ main] o.m.exomiser.cli.config.MainConfig : Default results directory set to: /usr/local/apps/exomiser/14.1.0/results
2025-01-30T15:50:53.259-05:00 INFO 138646 --- [ main] o.m.e.a.ExomiserConfigReporter : exomiser.data-directory: /fdb/exomiser/data/
2025-01-30T15:50:53.259-05:00 INFO 138646 --- [ main] o.m.e.a.ExomiserConfigReporter : exomiser.hg19.data-version: -
2025-01-30T15:50:53.259-05:00 INFO 138646 --- [ main] o.m.e.a.ExomiserConfigReporter : exomiser.hg38.data-version: 2406
2025-01-30T15:50:53.259-05:00 INFO 138646 --- [ main] o.m.e.a.ExomiserConfigReporter : exomiser.hg38.clinvar-data-version: 2406
2025-01-30T15:50:53.259-05:00 INFO 138646 --- [ main] o.m.e.a.ExomiserConfigReporter : exomiser.phenotype.data-version: 2406
2025-01-30T15:50:53.259-05:00 INFO 138646 --- [ main] o.m.e.a.ExomiserConfigReporter : spring.cache.type: caffeine
2025-01-30T15:50:53.403-05:00 INFO 138646 --- [ main] o.monarchinitiative.exomiser.cli.Main : Started Main in 10.326 seconds (process running for 10.521)
2025-01-30T15:50:53.544-05:00 INFO 138646 --- [ main] o.m.e.cli.ExomiserCommandLineRunner : Exomiser running...
2025-01-30T15:50:53.554-05:00 INFO 138646 --- [ main] o.m.exomiser.core.Exomiser : Running analysis using hg38 assembly with mode: PASS_ONLY
2025-01-30T15:50:53.557-05:00 INFO 138646 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : Validating sample input data
2025-01-30T15:50:53.588-05:00 INFO 138646 --- [ main] o.m.e.core.model.SampleIdentifiers : Proband sample name not specified - using sample name 'manuel' from VCF
2025-01-30T15:50:53.588-05:00 INFO 138646 --- [ main] o.m.e.c.a.s.PedigreeSampleValidator : No pedigree provided for sample 'manuel'
2025-01-30T15:50:53.588-05:00 INFO 138646 --- [ main] o.m.e.c.a.s.PedigreeSampleValidator : Creating single-sample pedigree for manuel
2025-01-30T15:50:53.593-05:00 INFO 138646 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : Running analysis for proband manuel (sample 1 in VCF) from samples: [manuel]. Using coordinates for genome assembly hg38.
2025-01-30T15:50:53.714-05:00 INFO 138646 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : Filtering variants with:
2025-01-30T15:50:53.715-05:00 INFO 138646 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : FailedVariantFilter{}
2025-01-30T15:50:53.715-05:00 INFO 138646 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : VariantEffectFilter{offTargetVariantTypes=[CODING_TRANSCRIPT_INTRON_VARIANT, FIVE_PRIME_UTR_EXON_VARIANT, THREE_PRIME_UTR_EXON_VARIANT, FIVE_PRIME_UTR_INTRON_VARIANT, THREE_PRIME_UTR_INTRON_VARIANT, NON_CODING_TRANSCRIPT_EXON_VARIANT, NON_CODING_TRANSCRIPT_INTRON_VARIANT, UPSTREAM_GENE_VARIANT, DOWNSTREAM_GENE_VARIANT, INTERGENIC_VARIANT, REGULATORY_REGION_VARIANT]}
2025-01-30T15:50:53.715-05:00 INFO 138646 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : FrequencyFilter{maxFreq=2.0}
2025-01-30T15:50:53.716-05:00 INFO 138646 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : Wrapping FrequencyFilter{maxFreq=2.0} with VariantDataProvider for sources [UK10K, GNOMAD_E_AFR, GNOMAD_E_AMR, GNOMAD_E_EAS, GNOMAD_E_NFE, GNOMAD_E_SAS, GNOMAD_G_AFR, GNOMAD_G_AMR, GNOMAD_G_EAS, GNOMAD_G_NFE, GNOMAD_G_SAS]
2025-01-30T15:50:53.716-05:00 INFO 138646 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : PathogenicityFilter{keepNonPathogenic=true}
2025-01-30T15:50:53.717-05:00 INFO 138646 --- [ main] o.m.e.c.analysis.AbstractAnalysisRunner : Wrapping PathogenicityFilter{keepNonPathogenic=true} with VariantDataProvider for sources [REVEL, MVP]
2025-01-30T15:50:53.717-05:00 INFO 138646 --- [ main] o.m.e.core.genome.VariantFactoryImpl : Annotating variant records, trimming sequences and normalising positions...
2025-01-30T15:50:54.404-05:00 INFO 138646 --- [ main] o.m.e.core.genome.VariantFactoryImpl : Processed 3706 variant records into 3705 single allele variants (including 0 structural variants)
2025-01-30T15:50:54.404-05:00 INFO 138646 --- [ main] o.m.e.core.genome.VariantFactoryImpl : Variant annotation finished in 0m 0s 686ms (686 ms)
2025-01-30T15:50:54.406-05:00 INFO 138646 --- [ main] .s.b.a.l.ConditionEvaluationReportLogger :

Error starting ApplicationContext. To display the condition evaluation report re-run your application with 'debug' enabled.
2025-01-30T15:50:54.416-05:00 ERROR 138646 --- [ main] o.s.boot.SpringApplication : Application run failed

org.monarchinitiative.svart.CoordinatesOutOfBoundsException: One-based coordinates 1:249149660-249149661 out of contig bounds [1,248956422]
at org.monarchinitiative.svart.GenomicInterval.validateCoordinatesOnContig(GenomicInterval.java:48) ~[svart-2.0.0-RC6.jar:na]
at org.monarchinitiative.svart.BaseGenomicRegion.<init>(BaseGenomicRegion.java:19) ~[svart-2.0.0-RC6.jar:na]
at org.monarchinitiative.svart.BaseGenomicVariant.<init>(BaseGenomicVariant.java:23) ~[svart-2.0.0-RC6.jar:na]
at org.monarchinitiative.svart.impl.DefaultGenomicVariant.<init>(DefaultGenomicVariant.java:8) ~[svart-2.0.0-RC6.jar:na]
at org.monarchinitiative.svart.impl.DefaultGenomicVariant.of(DefaultGenomicVariant.java:31) ~[svart-2.0.0-RC6.jar:na]
at org.monarchinitiative.svart.GenomicVariant.of(GenomicVariant.java:144) ~[svart-2.0.0-RC6.jar:na]
at org.monarchinitiative.svart.util.VcfConverter.convert(VcfConverter.java:75) ~[svart-2.0.0-RC6.jar:na]
at org.monarchinitiative.exomiser.core.genome.VariantContextConverter.convertToVariant(VariantContextConverter.java:113) ~[exomiser-core-14.1.0.jar:na]
at org.monarchinitiative.exomiser.core.genome.VariantFactoryImpl.buildVariantEvaluations(VariantFactoryImpl.java:161) ~[exomiser-core-14.1.0.jar:na]
at org.monarchinitiative.exomiser.core.genome.VariantFactoryImpl.lambda$buildAlleleVariantEvaluations$1(VariantFactoryImpl.java:110) ~[exomiser-core-14.1.0.jar:na]
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197) ~[na:na]
at java.base/java.util.ArrayList$SubList$2.forEachRemaining(ArrayList.java:1481) ~[na:na]
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[na:na]
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[na:na]
at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[na:na]
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[na:na]
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:na]
at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) ~[na:na]
at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:276) ~[na:na]
at java.base/java.util.stream.ReferencePipeline$15$1.accept(ReferencePipeline.java:541) ~[na:na]
at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133) ~[na:na]
at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1845) ~[na:na]
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[na:na]
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[na:na]
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:575) ~[na:na]
at java.base/java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260) ~[na:na]
at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:616) ~[na:na]
at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:622) ~[na:na]
at java.base/java.util.stream.ReferencePipeline.toList(ReferencePipeline.java:627) ~[na:na]
at org.monarchinitiative.exomiser.core.analysis.AbstractAnalysisRunner.loadAndFilterVariants(AbstractAnalysisRunner.java:235) ~[exomiser-core-14.1.0.jar:na]
at org.monarchinitiative.exomiser.core.analysis.AbstractAnalysisRunner.run(AbstractAnalysisRunner.java:112) ~[exomiser-core-14.1.0.jar:na]
at org.monarchinitiative.exomiser.core.Exomiser.run(Exomiser.java:83) ~[exomiser-core-14.1.0.jar:na]
at org.monarchinitiative.exomiser.core.Exomiser.run(Exomiser.java:69) ~[exomiser-core-14.1.0.jar:na]
at org.monarchinitiative.exomiser.cli.ExomiserCommandLineRunner.runJob(ExomiserCommandLineRunner.java:90) ~[exomiser-cli-14.1.0.jar:na]
at org.monarchinitiative.exomiser.cli.ExomiserCommandLineRunner.runJobs(ExomiserCommandLineRunner.java:65) ~[exomiser-cli-14.1.0.jar:na]
at org.monarchinitiative.exomiser.cli.ExomiserCommandLineRunner.run(ExomiserCommandLineRunner.java:60) ~[exomiser-cli-14.1.0.jar:na]
at org.springframework.boot.SpringApplication.lambda$callRunner$5(SpringApplication.java:790) ~[spring-boot-3.2.3.jar:3.2.3]
at org.springframework.util.function.ThrowingConsumer$1.acceptWithException(ThrowingConsumer.java:83) ~[spring-core-6.1.4.jar:6.1.4]
at org.springframework.util.function.ThrowingConsumer.accept(ThrowingConsumer.java:60) ~[spring-core-6.1.4.jar:6.1.4]
at org.springframework.util.function.ThrowingConsumer$1.accept(ThrowingConsumer.java:88) ~[spring-core-6.1.4.jar:6.1.4]
at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:798) ~[spring-boot-3.2.3.jar:3.2.3]
at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:789) ~[spring-boot-3.2.3.jar:3.2.3]
at org.springframework.boot.SpringApplication.lambda$callRunners$3(SpringApplication.java:774) ~[spring-boot-3.2.3.jar:3.2.3]
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[na:na]
at java.base/java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:357) ~[na:na]
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:510) ~[na:na]
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[na:na]
at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[na:na]
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[na:na]
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:na]
at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) ~[na:na]
at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:774) ~[spring-boot-3.2.3.jar:3.2.3]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:341) ~[spring-boot-3.2.3.jar:3.2.3]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1354) ~[spring-boot-3.2.3.jar:3.2.3]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1343) ~[spring-boot-3.2.3.jar:3.2.3]
at org.monarchinitiative.exomiser.cli.Main.main(Main.java:53) ~[exomiser-cli-14.1.0.jar:na]

Did I miss something? Thank you very much for your help.

@julesjacobsen
Contributor

Have you checked that you're analysing the VCF using the correct genome build? You have Exomiser set up to work on hg38 but it threw an error with the cause:

2025-01-30T15:50:54.416-05:00 ERROR 138646 --- [ main] o.s.boot.SpringApplication : Application run failed

org.monarchinitiative.svart.CoordinatesOutOfBoundsException: One-based coordinates 1:249149660-249149661 out of contig bounds [1,248956422]

For reference, the lengths of chromosome 1 in GRCh37 and GRCh38 are:

# Assembly name:  GRCh37.p13
# RefSeq assembly accession: GCF_000001405.25
# Sequence-Name	Sequence-Role	Assigned-Molecule	Assigned-Molecule-Location/Type	GenBank-Accn	Relationship	RefSeq-Accn	Assembly-Unit	Sequence-Length	UCSC-style-name
1	assembled-molecule	1	Chromosome	CM000663.1	=	NC_000001.10	Primary Assembly	249250621	chr1
# Assembly name:  GRCh38.p13
# RefSeq assembly accession: GCF_000001405.39
# Sequence-Name	Sequence-Role	Assigned-Molecule	Assigned-Molecule-Location/Type	GenBank-Accn	Relationship	RefSeq-Accn	Assembly-Unit	Sequence-Length	UCSC-style-name
1	assembled-molecule	1	Chromosome	CM000663.2	=	NC_000001.11	Primary Assembly	248956422	chr1

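The failing position, 249149660, fits within chromosome 1 of GRCh37 (249250621 bp) but lies beyond the end of chromosome 1 in GRCh38 (248956422 bp), which suggests you are trying to analyse a VCF called on GRCh37 using data from GRCh38. This will not produce good results.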
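
The comparison can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how Exomiser or svart performs the check; `CHR1_LENGTH` and `assemblies_containing` are hypothetical names, and the two lengths are the ones from the assembly reports quoted above.

```python
# Chromosome 1 lengths taken from the assembly reports quoted above.
CHR1_LENGTH = {"GRCh37": 249250621, "GRCh38": 248956422}


def assemblies_containing(end, lengths=CHR1_LENGTH):
    """Return the assemblies whose chromosome 1 can hold a 1-based end position."""
    return [name for name, length in lengths.items() if 1 <= end <= length]


if __name__ == "__main__":
    # End coordinate from the CoordinatesOutOfBoundsException message.
    print(assemblies_containing(249149661))  # prints ['GRCh37']
```

Only GRCh37 is returned for the coordinate in the exception, which is consistent with the VCF having been called against GRCh37.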

Double-check that your input data and Exomiser setup data are for the same assembly. In this case your examples/test-analysis-exome.yml points to a VCF file which was called using GRCh37, so you'll need to point it to one which uses GRCh38 coordinates. The attached file is the Pfeiffer.vcf lifted over to GRCh38 coordinates; once you update your test-analysis-exome.yml to point to it, the analysis should complete.
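
One way to double-check a VCF before running an analysis is to scan it for positions past the end of the target assembly's contigs. The sketch below assumes a tab-separated VCF and knows only the GRCh38 chromosome 1 length quoted in this thread; `GRCH38_LENGTHS`, `open_vcf` and `out_of_bounds` are hypothetical names, and the sample records are invented for illustration. A clean result does not prove the assembly matches, since a GRCh37 file may simply have no variants near the end of a chromosome.

```python
import gzip

# Only chromosome 1 is listed because that is the length quoted in this thread;
# a real check would need the length of every contig.
GRCH38_LENGTHS = {"1": 248956422}


def open_vcf(path):
    """Open a plain or gzip-compressed VCF as text."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def out_of_bounds(lines, lengths=GRCH38_LENGTHS):
    """Yield (chrom, pos) for records whose POS exceeds the known contig length."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos = line.split("\t", 2)[:2]
        contig = chrom[3:] if chrom.startswith("chr") else chrom
        length = lengths.get(contig)
        if length is not None and int(pos) > length:
            yield chrom, int(pos)


if __name__ == "__main__":
    # Invented records; only the second position comes from the error message.
    sample = [
        "##fileformat=VCFv4.1\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "1\t1000\t.\tA\tG\t.\tPASS\t.\n",
        "1\t249149660\t.\tC\tT\t.\tPASS\t.\n",
    ]
    print(list(out_of_bounds(sample)))  # prints [('1', 249149660)]
```

For a file on disk, the handle returned by `open_vcf(path)` can be passed straight to `out_of_bounds`, since it iterates line by line.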

Pfeiffer-hg38.vcf.gz

@maojn
Author

maojn commented Feb 4, 2025

Thank you very much. That solved the problem!
