Text Query Default Handling #2683

phorne-uncharted · 2021-06-03T14:08:46Z

Text queries match stemmed versions of words so that similar words are aggregated together. For example, singulars and plurals are reduced to their common form. However, stop words are not included in the stemmed word catalogue, so the current text queries simply drop them.

If the field is actually a text field, this is not too bad: the facet simply reflects a word count that excludes the stop words. If the field is a misclassified categorical field, however, the facet may present inaccurate information to the user. If one of the categories is a stop word (e.g. "a", "and"), the facet will not display it and the total count will be wrong, which makes it harder for the user to understand what is going on. The attached csv file is a simple example dataset in which one of the categories is "a"; when ingested, the field is classified as a text field. Note that the relevant facet does not display that category, making it look like only 15 rows exist when there are actually 20.
timestamp_empty.zip

The text queries should be updated so that words that do not match a stemmed version default to the empty string instead of being dropped.
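A minimal sketch of the difference, in Python rather than the project's actual query code. The `STEMS` catalogue, the `facet_counts` helper and the row values are all made up for illustration (they are not the contents of the attached csv); only the 15 vs. 20 row totals mirror the example above.

```python
from collections import Counter

# Hypothetical stem catalogue: maps a raw word to its stem. Stop words
# such as "a" have no entry, mirroring the behaviour described above.
STEMS = {"cat": "cat", "cats": "cat", "dog": "dog", "dogs": "dog"}


def facet_counts(values, default=None):
    """Count rows per stem; unmatched words are dropped unless a default is given."""
    counts = Counter()
    for value in values:
        stem = STEMS.get(value.lower(), default)
        if stem is None:
            continue  # current behaviour: the row silently disappears
        counts[stem] += 1
    return counts


# Made-up data: 20 rows, 5 of which hold the stop word "a".
rows = ["cat"] * 6 + ["cats"] * 4 + ["dogs"] * 5 + ["a"] * 5

dropped = facet_counts(rows)           # stop-word rows are lost
kept = facet_counts(rows, default="")  # stop-word rows land in the "" bucket

print(sum(dropped.values()))  # 15
print(sum(kept.values()))     # 20
```

With the default in place the facet total matches the row count, and the unmatched rows stay visible as their own (empty string) bucket rather than vanishing.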


phorne-uncharted added the enhancement label Jun 3, 2021
