Fix the OpenSearch regression so that
heterozygotes:(4805 && 1805) cadd > 20
and
heterozygotes:(4805 1805) cadd > 20 (no &&)
work the same.
Previously, in Elasticsearch 5.6 (b10), these were equivalent.
akotlar changed the title from "Fix opensearch regression, not treating spaces as && within fields" to "Investigate opensearch regression, not treating spaces as && within fields" on Nov 13, 2023
Fixed in https://github.com/bystrogenomics/bystro-web/pull/384 by adding a pre-processor for query_string queries that wraps each separate term in parentheses, which causes Elasticsearch/OpenSearch to search those terms individually, just as before. See the linked PR for more details. We also now have a small test suite to check that the transformation is correct; the first set of transforms we check is:
const testCases = [
  { input: "exonic pathogenic", expected: "(exonic) (pathogenic)" },
  { input: "(exonic pathogenic)", expected: "(exonic pathogenic)" },
  { input: 'refseq.name2:GAA', expected: '(refseq.name2:GAA)' },
  { input: 'refseq.name2:"GAA"', expected: '(refseq.name2:"GAA")' },
  { input: 'gene:"HELLO"', expected: '(gene:"HELLO")' },
  { input: '"Hello"', expected: '("Hello")' },
  { input: '+(chrom:chr17 pos:39580562)', expected: '+(chrom:chr17 pos:39580562)' },
  { input: 'exonic AND cadd:>20.2', expected: '(exonic) AND (cadd:>20.2)' },
  { input: '-(gene:BRCA1) OR +(gene:BRCA2)', expected: '-(gene:BRCA1) OR +(gene:BRCA2)' },
  { input: '*pathogenic*', expected: '(*pathogenic*)' },
  { input: 'BRCA1? AND BRCA2?', expected: '(BRCA1?) AND (BRCA2?)' },
];
As seen above, terms that are already wrapped in parentheses are not affected. This gives us the best of both worlds: by default, queries behave as before, so the user can freely type queries like exonic pathogenic cadd > 20, while we now also support synonyms that are phrases of multiple space-separated terms. Such phrases should be wrapped in parentheses, as in (some long disease name), or, for an exact match, in quotes, as in "some long disease name". I will add documentation for this.
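For readers who want a feel for the transform without opening the PR, here is a minimal sketch of a pre-processor that satisfies the test cases listed above. It is not the code from the linked PR: the names `splitTopLevel` and `wrapTerms` and the operator list are assumptions made for illustration. It also makes no attempt to handle comparison operators separated by spaces (as in cadd > 20), which the listed test cases do not cover.

```javascript
// Hypothetical sketch, not the implementation from the linked PR.
// Boolean operators that are passed through unwrapped (assumed list).
const OPERATORS = new Set(["AND", "OR", "NOT", "&&", "||", "!"]);

// Split a query on whitespace that is outside quotes and parentheses.
// Each token records whether it contains an unquoted parenthesis group.
function splitTopLevel(query) {
  const tokens = [];
  let text = "";
  let grouped = false;
  let depth = 0;
  let inQuotes = false;

  const flush = () => {
    if (text) tokens.push({ text, grouped });
    text = "";
    grouped = false;
  };

  for (let i = 0; i < query.length; i++) {
    const ch = query[i];
    // Keep backslash-escaped characters verbatim.
    if (ch === "\\" && i + 1 < query.length) {
      text += ch + query[i + 1];
      i++;
      continue;
    }
    if (ch === '"') {
      inQuotes = !inQuotes;
    } else if (!inQuotes && ch === "(") {
      depth++;
      grouped = true;
    } else if (!inQuotes && ch === ")") {
      depth = Math.max(0, depth - 1);
    }
    if (/\s/.test(ch) && depth === 0 && !inQuotes) {
      flush(); // top-level whitespace ends the current token
    } else {
      text += ch;
    }
  }
  flush();
  return tokens;
}

// Wrap every bare top-level term in parentheses; leave operators and
// already-parenthesized groups (including +(...) and -(...)) untouched.
function wrapTerms(query) {
  return splitTopLevel(query)
    .map(({ text, grouped }) =>
      OPERATORS.has(text) || grouped ? text : `(${text})`
    )
    .join(" ");
}
```

Under these assumptions, `wrapTerms("exonic AND cadd:>20.2")` returns `(exonic) AND (cadd:>20.2)`, while `wrapTerms("+(chrom:chr17 pos:39580562)")` returns its input unchanged. Tracking parenthesis depth and quote state in the tokenizer is what keeps multi-word groups and quoted phrases intact.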