Wire non-OpenSearch backends through the database registry#1065
Conversation
PR opensearch-project#1023 introduced the DatabaseClient abstraction and registry but left several code paths hardcoded to the OpenSearch client. Surface each of these so a non-OpenSearch backend (registered via the existing factory + registry) can run a benchmark end-to-end. - benchmark.py: add `--database-type` to `osb run`. Choices derive from the DatabaseType enum so new backends become selectable simply by registering. Sets `cfg.opts("database", "type")` which the worker coordinator already reads. - test_run_orchestrator.py: skip the OS distribution-version probe (which builds a raw opensearchpy client and walks OS-shaped health/info routes) when database.type != opensearch. - worker_coordinator.py: WorkerCoordinatorActor.create_os_clients routes through DatabaseClientFactory.create_client_factory for non-OpenSearch backends, mirroring the async path at WorkerCoordinator.os_clients. wait_for_rest_api delegates the readiness probe to a factory-provided wait_for_rest_layer when database.type != opensearch; the legacy probe is preserved for OpenSearch. OpenSearch behaviour is preserved. Closes opensearch-project#1064 Signed-off-by: zznate <zznate.m@gmail.com>
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Explore these optional code suggestions:
|
|
thanks for this @zznate ! Just looking at a high level I think I planned to introduce something similar in #1043 along with the introduction of Vespa support, but that PR is still pending review. This LGTM though and solves the hardcoding issue immediately so I think we can just go ahead and merge :) |
| if database_type.lower() != "opensearch": | ||
| hosts = self.config.opts("client", "hosts").all_hosts["default"] | ||
| client_options = dict(self.config.opts("client", "options").all_client_options["default"]) | ||
| db_factory = DatabaseClientFactory.create_client_factory(database_type, hosts, client_options) |
There was a problem hiding this comment.
actually one NIT i have is that DatabaseClientFactory seems to get instantiated twice, once here and once up above on line 991, maybe we can just clean that up?
There was a problem hiding this comment.
Rad - thanks @OVI3D0
Good catch - i'll do that real quick.
There was a problem hiding this comment.
@OVI3D0 Any interest in a Quickwit implementation? I can clean it up a bit and submit with a new issue if so.
Caveat: there is a lot of surface area that doesnt correlate, so we would need to plan out a consistent way of handling this. That said, might be a good forcing function for doing this right before other impls start to arrive?
There was a problem hiding this comment.
Any interest in a Quickwit implementation?
yeah for sure @zznate ! the hope with this project was to have OSB branch out from just being a benchmarking tool for OpenSearch and have it be used for any compatible search engine :) feel free to open up an issue or even contribute the changes and we can get those in
we would need to plan out a consistent way of handling this
There was a larger plan for this written out here: #998
Here you can read about the proposed architecture for future search engines we had in mind
There was a problem hiding this comment.
Perferct - thanks! I'll give that a read. Missed that in my initial pass.
There was a problem hiding this comment.
@OVI3D0 - do i need to do anything else to get this merged? I see it's blocked still, please let me know if you need anything from my end.
There was a problem hiding this comment.
@zznate unfortunately code changes across the org are locked down until this is resolved: https://www.stepsecurity.io/blog/mini-shai-hulud-is-back-a-self-spreading-supply-chain-attack-hits-the-npm-ecosystem
:p
There was a problem hiding this comment.
Oh gross! Understood tho - thanks!
Addresses review feedback: the non-OpenSearch factory was being instantiated in both create_os_clients and wait_for_rest_api. Cache on the actor in the former and reuse in the latter. Signed-off-by: zznate <zznate.m@gmail.com>
PR #1023 introduced the DatabaseClient abstraction and registry but left several code paths hardcoded to the OpenSearch client. Surface each of these so a non-OpenSearch backend (registered via the existing factory + registry) can run a benchmark end-to-end.
--database-typetoosb run. Choices derive from the DatabaseType enum so new backends become selectable simply by registering. Setscfg.opts("database", "type")which the worker coordinator already reads.OpenSearch behaviour is preserved. See #1064 for details.
Closes #1064
Description
[Describe what this change achieves]
Issues Resolved
[List any issues this PR will resolve]
Testing
[Describe how this change was tested]
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.