Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Catch sqlite exceptions when filter entities #6601

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

grzesiek2010
Copy link
Member

@grzesiek2010 grzesiek2010 commented Feb 11, 2025

Closes #6600
Closes #6597

Why is this the best possible solution? Were any other approaches considered?

For: #6600
When a filter applied to entities contains an expression referencing another question, we cannot build an optimized SQLite query. Instead, we need to fall back to Javarosa, which has all the necessary knowledge to evaluate such an expression. The optimized way of filtering can be used only if it uses entity properties available in the entity database.
For: #6597
At the moment of building an SQLite query to optimize filtering, both #6600 and #6597 appear to be the same issue (a missing column). However, when we fall back to Javarosa, #6600 is evaluated with the referenced question, while #6597 is still treated as a missing property.

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

For: #6600
We need to test filtering with expressions that reference other questions in a form, not just expressions that use entity properties. The form attached to the issue is an example of it.
For: #6597
We need to test filtering with expressions that simply use non-existing properties.

Do we need any specific form for testing your changes? If so, please attach one.

The form is attached to the issue.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

No.

Before submitting this PR, please make sure you have:

  • added or modified tests for any new or changed behavior
  • run ./gradlew connectedAndroidTest (or ./gradlew testLab) and confirmed all checks still pass
  • added a comment above any new strings describing it for translators
  • added any new strings with date formatting to DateFormatsTest
  • verified that any code or assets from external sources are properly credited in comments and/or in the about file.
  • verified that any new UI elements use theme colors. UI Components Style guidelines

@grzesiek2010 grzesiek2010 marked this pull request as ready for review February 13, 2025 05:02
@grzesiek2010 grzesiek2010 added the high priority Should be looked at before other PRs/issues label Feb 13, 2025
@seadowg seadowg self-requested a review February 13, 2025 11:18
Copy link
Member

@seadowg seadowg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to solve this at the DatabaseEntitiesRepository level instead by adding tests for these cases to/modifying #query?


val choices = scenario.choicesOf("/data/question").map { it.value }
assertThat(choices.isEmpty(), equalTo(true))
assertThat(fallthroughFilterStrategy.fellThrough, equalTo(true))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

works correctly in the optimized way -- what makes it optimized? This looks like it fell through which I thought would be non-optimized.

This test looks identical to https://github.com/getodk/collect/pull/6601/files#diff-4a25b4c0ba0780186c2edabe26d8518d4e3e6a022a482669373e2a71ceac6dfbR585 to me

In this test not_existing_property is the reference in the secondary instance and 'value' is a static value.

In the test above, undefined is the reference in the secondary instance and /data/ref_question is a static value that came from the form.

It's confusing because the order is different and the intended meaning is different! But I believe that from the system's standpoint they should be in the same category. If there's code somewhere that treats them as different we should dig into it, I think.

Copy link
Member Author

@grzesiek2010 grzesiek2010 Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

works correctly in the optimized way -- what makes it optimized? This looks like it fell through which I thought would be non-optimized.

I made a mistake in the title it's not optimized of course. I will fix that.

If you set not_existing_property='value', it will always return an empty list of entities (because not_existing_property does not exist).
However, if you use undefined=/data/ref_question or 'value'=/data/ref_question (because we focus too much on the undefined case), the list won't be empty - as long as the expected value matches the one entered in the question.
Wouldn't it be enough to treat these two cases as distinct and have separate tests for each?

Copy link
Member

@lognaturel lognaturel Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you set not_existing_property='value', it will always return an empty list of entities (because not_existing_property does not exist).

And with not_existing_property='', you should always get all Entities. That's the same as the undefined=/data/ref_question case when /data/ref_question has a blank value.

I sort of see the distinction that you're making, thanks for explaining it again, that was helpful. It doesn't hurt to have a few extra tests and I don't think there's a fundamental misunderstanding here. 👍

'value'=/data/ref_question

This one does definitely feel different from the other two to me. This should be identified as not an expression that can use the optimized filter strategy and bypass before it even gets there (maybe good to double check this?). The other two should be identified as candidates for the filter strategy and then be kicked back to the next one once the column is determined to be unknown.

@grzesiek2010
Copy link
Member Author

Would it be possible to solve this at the DatabaseEntitiesRepository level instead by adding tests for these cases to/modifying #query?

We have to catch queries that contain non-existent columns and fall back to Javarosa so no there is nothing we can do just in the DatabaseEntitiesRepository

@lognaturel
Copy link
Member

there is nothing we can do just in the DatabaseEntitiesRepository

An alternative would be to "look before you leap" and fall back if there are any unknown properties in the expression, I think. I believe that's what @seadowg said the code did before.

@lognaturel
Copy link
Member

lognaturel commented Feb 14, 2025

I think one step even further would be to detect unknown columns in subexpressions, treat those as ‘’, and then evaluate the sub expression. That should always be possible for the shape of subexpression that makes it to the entities filter strategy.

For example, /data/ref=undefined would be detected as such an expression, it would be evaluated as /data/ref=‘’ and then the result (true or false) would be used in the SQL. That would avoid the need to fall back to another filter strategy. Not a must but hopefully the example further illustrates why I think of all the cases with unknown properties as being in the same category.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
high priority Should be looked at before other PRs/issues
Projects
None yet
3 participants