Directly return responses from Local Cluster client #141
Conversation
I assume from the consumer side, like ML Commons/Flow Framework, we don't have to make any changes based on this, right?
I've made an effort to be backwards compatible with the current implementation. I'm still debating whether to mark the parser methods as deprecated. So while you don't "have to" make changes, it would be a Good Thing™️ to replace the parser-based pattern with the direct response getter.
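For illustration, the pattern swap being discussed is roughly the following; the accessor names here (parser(), getResponse()) are assumptions taken from this thread, not verified SDK signatures.

```java
// Sketch only -- parser() and getResponse() are assumed accessor names from
// this discussion, not confirmed SdkClient API.

// Old pattern: rebuild the SearchResponse by re-parsing serialized content.
// The serialize/parse round trip is where aggregation details were being lost.
SearchResponse parsed = SearchResponse.fromXContent(response.parser());

// New pattern: return the response the local cluster client already produced.
SearchResponse direct = response.getResponse();
```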
Yes, this implementation basically encodes "if we don't have the NodeClient response, create one from a parser." The only potentially controversial question is whether it is even possible for the response to be null. I don't think so, but the code allows for that.
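A minimal, generic sketch of that fallback shape, using hypothetical names rather than the SDK's actual classes:

```java
import java.util.function.Supplier;

// Generic illustration of the fallback described above; the class and field
// names are hypothetical, not the SDK's actual types.
class ResponseHolder<T> {
    private final T response;            // set when the local cluster client produced it
    private final Supplier<T> fallback;  // reconstructs a response from parsed content

    ResponseHolder(T response, Supplier<T> fallback) {
        this.response = response;
        this.fallback = fallback;
    }

    // If we already hold the original response, return it directly;
    // otherwise fall back to building one from a parser.
    T get() {
        return response != null ? response : fallback.get();
    }
}
```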
@dhrubo-os @arjunkumargiri this is ready for review
try {
    return SdkClientUtils.createParser(bulkResponse);
} catch (IOException | NullPointerException e) {
    return null;
}
What is the cascade effect if we just return null to the downstream?
This is one of those "should never happen" situations.
SdkClientUtils null-checks it where it's used internally.
Otherwise the caller has to null-check.
The alternative is throwing an unchecked exception the caller doesn't know to catch, or a checked exception that will never happen but that we force the caller to handle.
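For illustration, the caller-side guard being described might look like this; the parse step (BulkResponse.fromXContent) and the String-based createParser overload are assumptions for the sketch.

```java
// Hypothetical caller-side guard for the "should never happen" null case above.
private BulkResponse parseOrFail(String bulkResponseJson) throws IOException {
    XContentParser parser = SdkClientUtils.createParser(bulkResponseJson);
    if (parser == null) {
        // Upstream response was null: fail explicitly rather than NPE downstream.
        throw new IOException("No bulk response available to parse");
    }
    return BulkResponse.fromXContent(parser);
}
```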
Can we please add more test details in the issue description? Have we tried an aggregated search with any plugin like Flow Framework/ml-commons with these changes?
Can we also test bulk, since we made changes there too?
      mapper.serialize(obj, generator);
  }
- return jsonXContent.createParser(NamedXContentRegistry.EMPTY, DeprecationHandler.IGNORE_DEPRECATIONS, stringWriter.toString());
+ return SdkClientUtils.createParser(stringWriter.toString());
I assume the DDB client's parsing logic is different, which is why we aren't making the same changes to the DDB client?
Actually DDB client logic is identical and we should make this same change there.
This isn't so much a change as a refactoring, moving the jsonXContent.createParser code to the SdkClientUtils class and delegating the call there.
The reason it was less important for the bug fix is that DDB doesn't implement search.
There is no change to the functionality of bulk, which is thoroughly unit tested. In fact, the change revealed a bug in the test (wrong type) which became more obvious with the direct pass-through.
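For reference, the shared helper that this refactoring delegates to would be roughly the following shape, inferred from the inline jsonXContent.createParser call it replaces; this is a sketch, not the actual SdkClientUtils source, and import paths can differ by OpenSearch version.

```java
import java.io.IOException;

import org.opensearch.core.xcontent.DeprecationHandler;
import org.opensearch.core.xcontent.NamedXContentRegistry;
import org.opensearch.core.xcontent.XContentParser;

import static org.opensearch.common.xcontent.json.JsonXContent.jsonXContent;

// Rough shape of the shared parser helper; not the exact SdkClientUtils source.
public final class SdkClientUtilsSketch {

    private SdkClientUtilsSketch() {}

    // Centralizes parser creation so the local cluster and DDB clients share
    // one code path instead of each repeating the jsonXContent call.
    public static XContentParser createParser(String json) throws IOException {
        return jsonXContent.createParser(NamedXContentRegistry.EMPTY, DeprecationHandler.IGNORE_DEPRECATIONS, json);
    }
}
```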
I started #132 by writing a unit test that failed without the fix, and that test is carried over here. I'm not clear what you're asking for. A sanity check with Flow Framework? I've done that, as well as run the tenant-aware integration tests using the new client.
Yes, although there are likely many different combinations.
Yeah, this is what I was asking. If you ran the tenant-aware integration tests with the new changes in this PR, I'm good with this. Thanks for clarifying.
I should be clearer: you will need to change the search handler's code to the new getResponse() version to fix the aggregation bug in the local client with this PR.
Added an integ test to ML Commons PR opensearch-project/ml-commons#3516.

opensearch-project/ml-commons@0bef7a9 adds an integ test that uses an aggregation to search. This results in the failure stack trace in #132.

opensearch-project/ml-commons@0e10ea8 updates the calls to SdkClient search to directly return the search response. The integ test passes.
Ran a sanity test against the ML Commons PR opensearch-project/ml-commons#3516 branch deployed in an OpenSearch 3.0.0-beta1 docker container.

Created a connector and registered a model with that connector:

POST https://localhost:9200/_plugins/_ml/models/_register
{
"name": "bedrock:embedding cohere model",
"function_name": "remote",
"description": "embedding open ai model",
"connector_id": "5vYkXJYBhf97Ja-IYBpD"
}

{
"task_id": "6_YnXJYBhf97Ja-IFhpG",
"status": "CREATED",
"model_id": "7PYnXJYBhf97Ja-IFhp9"
}

Deployed the model:

POST https://localhost:9200/_plugins/_ml/models/7PYnXJYBhf97Ja-IFhp9/_deploy
{
"task_id": "7vYtXJYBhf97Ja-I5hqc",
"task_type": "DEPLOY_MODEL",
"status": "COMPLETED"
}

Undeployed the model:

https://localhost:9200/_plugins/_ml/models/7PYnXJYBhf97Ja-IFhp9/_undeploy
{
"6vhLaJ3pRDy773z0aZCXPA": {
"stats": {
"7PYnXJYBhf97Ja-IFhp9": "undeployed"
}
}
}

Undeployed the model again:

https://localhost:9200/_plugins/_ml/models/7PYnXJYBhf97Ja-IFhp9/_undeploy
{
"6vhLaJ3pRDy773z0aZCXPA": {
"stats": {
"7PYnXJYBhf97Ja-IFhp9": "not_found"
}
}
}

Searched with an aggregation:

GET https://localhost:9200/_plugins/_ml/models/_search
{
"size": 0,
"aggs": {
"unique_model_names": {
"terms": {
"field": "name.keyword",
"size": 10000
}
}
}
}
{
"took": 9,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"aggregations": {
"unique_model_names": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "bedrock:embedding cohere model",
"doc_count": 1
}
]
}
}
}
Codecov Report: All modified and coverable lines are covered by tests ✅
@dhrubo-os thanks for the approval! Some quick follow-up:
Signed-off-by: Daniel Widdis <[email protected]>
(cherry picked from commit 5c7d005)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

(cherry picked from commit 5c7d005)
Signed-off-by: Daniel Widdis <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
The backport to 2.19 failed. To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/opensearch-remote-metadata-sdk/backport-2.19 2.19
# Navigate to the new working tree
pushd ../.worktrees/opensearch-remote-metadata-sdk/backport-2.19
# Create a new branch
git switch --create backport/backport-141-to-2.19
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5c7d0053506a7ff3b9e1e8d5e1fa28be5f44f750
# Push it to GitHub
git push --set-upstream origin backport/backport-141-to-2.19
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/opensearch-remote-metadata-sdk/backport-2.19

Then, create a pull request where the base branch is 2.19 and the compare/head branch is backport/backport-141-to-2.19.
WIP: Still need to address bulk response
Description
Avoids serializing and deserializing responses in the (default) local cluster client. This avoids parsing issues as well as loss of runtime information from subclasses.
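As a general illustration of that second point (plain Java with Jackson, not the SDK's actual types): a serialize/deserialize round trip rebuilds only the declared type, so subclass-specific runtime state is dropped, whereas passing the original object through keeps it intact.

```java
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

// Plain-Java illustration (not SDK types) of losing subclass information
// in a JSON round trip.
class Response {
    public String status = "ok";
}

class AggregatedResponse extends Response {
    public long bucketCount = 3;  // extra state present only on the subclass
}

public class RoundTripDemo {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper()
            .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);

        Response original = new AggregatedResponse();

        // Serialize, then deserialize as the base type the caller declared.
        String json = mapper.writeValueAsString(original);
        Response roundTripped = mapper.readValue(json, Response.class);

        System.out.println(original instanceof AggregatedResponse);      // true
        System.out.println(roundTripped instanceof AggregatedResponse);  // false: subclass info gone
    }
}
```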
Issues Resolved
Alternate solution to resolve #132
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.