API: Consider Support and Documentation for more complex AND/OR queries #506

mrshll1001 · 2024-05-15T09:28:21Z

This stems from @apricot13's post on the OR Forum:

https://forum.openreferral.org/t/clarification-around-and-or-logic-for-filters-and-taxonomies/

At the moment the API supports filters on several endpoints, for several parameters. These filters behave in a standardised way: they are cumulative, and a result is only let through if it matches at least one of the values given for each filtered parameter.

This behaviour has the following effects:

Multiple filters on different parameters effectively create a boolean AND query. The query /services?taxonomy_id=XXX&organization_id=YYY will only return results that have a taxonomy_id of XXX and an organization_id of YYY.
Multiple filters on the same parameter create a boolean OR query on that parameter. The query /services?taxonomy_id=XXX&taxonomy_id=YYY will return results that have a taxonomy_id of XXX, of YYY, or both.
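To make the two effects above concrete, here is a minimal sketch in Python of how a server might evaluate these filters against in-memory records. The record layout (`taxonomy_ids` as a set, a single `organization_id`) and the function names are assumptions for illustration only; they are not taken from the HSDS schema or from any reference implementation.

```python
from urllib.parse import parse_qs

# Hypothetical records; the field layout is illustrative, not the HSDS schema.
SERVICES = [
    {"id": "s1", "organization_id": "YYY", "taxonomy_ids": {"XXX"}},
    {"id": "s2", "organization_id": "ZZZ", "taxonomy_ids": {"YYY"}},
    {"id": "s3", "organization_id": "YYY", "taxonomy_ids": {"XXX", "YYY"}},
    {"id": "s4", "organization_id": "ZZZ", "taxonomy_ids": {"QQQ"}},
]


def field_values(service, param):
    """Return the set of values a service holds for a filter parameter."""
    if param == "taxonomy_id":
        return service["taxonomy_ids"]
    return {service[param]}


def filter_services(query_string, services=SERVICES):
    """AND across different parameters, OR across repeats of one parameter."""
    filters = parse_qs(query_string)  # repeated parameters become lists
    return [
        s["id"]
        for s in services
        # every parameter must match (AND) on at least one of its values (OR)
        if all(field_values(s, p) & set(vals) for p, vals in filters.items())
    ]


print(filter_services("taxonomy_id=XXX&organization_id=YYY"))
print(filter_services("taxonomy_id=XXX&taxonomy_id=YYY"))
```

The first call keeps only records matching both parameters; the second keeps any record carrying either taxonomy id, including records that carry both.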

This is all well and good, but it does not cover (at least) two important use-cases:

What if we only want results that contain both of the stated taxonomy ids, and aren't interested in results matching just one of them?
What if we want results that match at least one of several sets of desirable properties, where the sets may be mutually exclusive, i.e. (A AND B AND C) OR (D AND E AND F)?
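Both use-cases can be stated as small predicates over the set of ids attached to an item. The sketch below is only an illustration of the logic being asked for, written in Python; the function names are hypothetical and nothing here is part of the current API spec.

```python
def matches_all(item_ids, required):
    """Use-case 1: the item must carry every one of the required ids."""
    return set(required) <= set(item_ids)


def matches_any_group(item_ids, groups):
    """Use-case 2: (A AND B AND C) OR (D AND E AND F).

    `groups` is a list of id lists; the item must fully satisfy
    at least one group.
    """
    return any(matches_all(item_ids, group) for group in groups)


# An item tagged XXX, YYY and ZZZ satisfies "XXX AND YYY".
print(matches_all({"XXX", "YYY", "ZZZ"}, ["XXX", "YYY"]))

# An item tagged D, E and F satisfies the second group only.
print(matches_any_group({"D", "E", "F"}, [["A", "B", "C"], ["D", "E", "F"]]))
```

Neither predicate can be expressed with the current filters alone, since repeating a parameter always widens the result set rather than narrowing it.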

I think that the former is more immediate.

There is also the question of whose problem this is to address. From an HTTP and REST perspective, I think it can be argued that the HSDS API spec is already doing its job by providing the filters. That isn't a great attitude to have towards supporting people who are implementing the spec or trying to find services, though.

The first problem, where we ONLY want results that contain BOTH of the stated taxonomy_ids, may well be a sorting problem rather than a querying or filtering problem. If we've got a set of results which we know contains:

items with a taxonomy_id of XXX (and perhaps other taxonomy_ids not explicitly filtered for)
items with a taxonomy_id of YYY (and perhaps other taxonomy_ids not explicitly filtered for)
items with a taxonomy_id of XXX AND YYY (and perhaps other taxonomy_ids not explicitly filtered for)

Then it may be suitable to just sort the results according to how many of the filter values each item matches. I think there could be a productive discussion around whether it's the API implementation or the consuming application that does the sorting.
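As a rough illustration of the sorting idea, the Python sketch below ranks an already-filtered result set by how many of the requested taxonomy ids each item carries. The record layout and the name `rank_by_matches` are assumptions made for the example; the same logic could sit in either the API implementation or the consuming application.

```python
def rank_by_matches(services, wanted_ids):
    """Order services by how many of the requested ids they match, most first."""
    wanted = set(wanted_ids)
    # sorted() is stable, so items with equal counts keep their original order
    return sorted(
        services,
        key=lambda s: len(wanted & s["taxonomy_ids"]),
        reverse=True,
    )


# Hypothetical result of /services?taxonomy_id=XXX&taxonomy_id=YYY
results = [
    {"id": "s1", "taxonomy_ids": {"XXX"}},
    {"id": "s2", "taxonomy_ids": {"YYY"}},
    {"id": "s3", "taxonomy_ids": {"XXX", "YYY", "ZZZ"}},
]

ranked = rank_by_matches(results, ["XXX", "YYY"])
print([s["id"] for s in ranked])
```

Items matching both ids rise to the top, and a consumer that wants a strict AND could simply cut the list off where the match count drops below the number of requested ids.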

We could address the issue of more complex querying more explicitly by providing a query endpoint, or a query parameter on other endpoints, which takes a query language for the data, for example Lucene syntax or Elasticsearch query strings. That may, however, be difficult for people using other technologies to parse and translate into a query for their own database systems.


dan-odsc added discussion schema labels May 16, 2024

dan-odsc self-assigned this May 16, 2024
