Conversation

@dbwiddis
Member

WIP: Still need to address bulk response

Description

Avoids serializing and deserializing responses in the (default) client. This prevents parsing issues as well as loss of runtime type information from subclasses.

Issues Resolved

Alternate solution to resolve #132

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dhrubo-os
Contributor

I assume from the consumer side like ML-Common/Flow framework we don't have to make any changes based on this, right?

@dbwiddis
Member Author

> I assume from the consumer side like ML-Common/Flow framework we don't have to make any changes based on this, right?

I've made an effort to be backwards compatible for the current implementation. I'm still debating whether to mark the parser methods as deprecated.

So while you don't "have to" make changes, it would be a Good Thing™️ to replace the pattern:

return response.parser() == null ? null : GetResponse.fromXContent(response.parser());

with

return response.getResponse();
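
The migration described above can be sketched as follows. The types below are simplified stand-ins for the SDK's GetDataObjectResponse and OpenSearch's GetResponse, written only to illustrate the pattern; the real accessor also declares IOException, and the real classes live in opensearch-remote-metadata-sdk and opensearch-core.

```java
// Simplified stand-ins for the SDK types; illustration only, not the real API.
class XContentParser { }

class GetResponse {
    static GetResponse fromXContent(XContentParser parser) {
        return new GetResponse(); // real parsing omitted in this sketch
    }
}

class GetDataObjectResponse {
    private final GetResponse cached;    // typed response from the NodeClient, if any
    private final XContentParser parser; // serialized fallback, if any

    GetDataObjectResponse(GetResponse cached, XContentParser parser) {
        this.cached = cached;
        this.parser = parser;
    }

    XContentParser parser() {
        return parser;
    }

    // New accessor: return the original typed response when present,
    // otherwise fall back to parsing (the real method declares IOException).
    GetResponse getResponse() {
        if (cached != null) {
            return cached; // preserves runtime subclass information
        }
        return parser == null ? null : GetResponse.fromXContent(parser);
    }
}

class Consumer {
    // Old pattern: always re-parse the serialized form, losing subclass info.
    static GetResponse oldPattern(GetDataObjectResponse r) {
        return r.parser() == null ? null : GetResponse.fromXContent(r.parser());
    }

    // New pattern: one call, no lossy serialize/parse round trip.
    static GetResponse newPattern(GetDataObjectResponse r) {
        return r.getResponse();
    }
}
```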

@dhrubo-os
Contributor

> > I assume from the consumer side like ML-Common/Flow framework we don't have to make any changes based on this, right?
>
> I've made an effort to be backwards compatible for the current implementation. I'm still debating whether to mark the parser methods as deprecated.
>
> So while you don't "have to" make changes, it would be a Good Thing™️ to replace the pattern:
>
> return response.parser() == null ? null : GetResponse.fromXContent(response.parser());
>
> with
>
> return response.getResponse();

Will return response.getResponse(); also be compatible with other client implementations, like DDBClient?

@dbwiddis
Member Author

> Will return response.getResponse(); also be compatible with other client implementations, like DDBClient?

Yes, this implementation basically encodes the "if we don't have the NodeClient response, create one from a parser."

The possible IOException will need to be handled (but I think it's already done!)

The only potentially controversial question is whether it is even possible for the response to be null. I don't think so, but the code allows for it.
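
A caller-side sketch of the two points above, handling the declared IOException and defensively null-checking the result. The class names here (SdkGetResult, Caller) are hypothetical stand-ins chosen for illustration, not the SDK's actual types.

```java
import java.io.IOException;

// Stand-in for the typed response; illustration only.
class GetResponse { }

// Stand-in for the SDK wrapper; the real accessor can throw while
// parsing the serialized fallback representation.
class SdkGetResult {
    private final GetResponse response;

    SdkGetResult(GetResponse response) {
        this.response = response;
    }

    GetResponse getResponse() throws IOException {
        return response;
    }
}

class Caller {
    static String handle(SdkGetResult result) {
        try {
            GetResponse response = result.getResponse();
            // Defensive null check: a null response "should never happen,"
            // but the code allows for it.
            return response == null ? "no response" : "ok";
        } catch (IOException e) {
            // The declared IOException must be handled somewhere up the chain.
            return "parse failure: " + e.getMessage();
        }
    }
}
```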

@dbwiddis dbwiddis force-pushed the passthrough-response branch 2 times, most recently from fc86b90 to 80a12b5 Compare April 15, 2025 20:53
@dbwiddis dbwiddis marked this pull request as ready for review April 15, 2025 21:39
@dbwiddis dbwiddis force-pushed the passthrough-response branch 3 times, most recently from 3c4dcc4 to ab9674d Compare April 16, 2025 17:06
@dbwiddis
Member Author

@dhrubo-os @arjunkumargiri this is ready for review

@dbwiddis dbwiddis force-pushed the passthrough-response branch 2 times, most recently from 2bbe751 to 412c8a3 Compare April 16, 2025 23:56
    try {
        return SdkClientUtils.createParser(bulkResponse);
    } catch (IOException | NullPointerException e) {
        return null;
Contributor

What is the cascade effect if we just return null to the downstream?

Member Author

This is one of those “should never happen” situations.

SdkClientUtils null-checks it where it's used.

Otherwise the caller has to null check.

Member Author

The alternative is throwing an unchecked exception the caller doesn't know to catch, or a checked exception that will never happen but that we force the caller to handle.

@dhrubo-os
Contributor

Can we please add more test details in the issue description?

Have we tried an aggregated search with any plugin like FF/ml-commons with these changes?

@dhrubo-os
Contributor

Can we also test bulk, since we made changes there too?

        mapper.serialize(obj, generator);
    }
-   return jsonXContent.createParser(NamedXContentRegistry.EMPTY, DeprecationHandler.IGNORE_DEPRECATIONS, stringWriter.toString());
+   return SdkClientUtils.createParser(stringWriter.toString());
Contributor

I assume the DDB client parsing logic is different, which is why we aren't adding the same changes to the DDB client?

Member Author

Actually DDB client logic is identical and we should make this same change there.

This isn't so much a change as a refactoring, moving the jsonXContent.createParser code to the SdkClientUtils class and delegating the call there.

The reason it was less important for the bug fix is that DDB doesn't implement search.
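
The refactoring described above (each client delegating its inline parser construction to one shared helper) can be sketched roughly as follows. The parser type here is a stand-in so the sketch stays self-contained; the real helper returns an org.opensearch.core.xcontent.XContentParser via jsonXContent.createParser(NamedXContentRegistry.EMPTY, DeprecationHandler.IGNORE_DEPRECATIONS, json), as shown in the diff.

```java
// Stand-in parser type; the real code returns an XContentParser from
// opensearch-core rather than wrapping the raw JSON string.
class ParsedJson {
    final String json;

    ParsedJson(String json) {
        this.json = json;
    }
}

class SdkClientUtils {
    // Shared helper: the duplicated jsonXContent.createParser(...) call from
    // each client implementation moves here, so clients delegate instead of
    // repeating the boilerplate.
    static ParsedJson createParser(String json) {
        return new ParsedJson(json);
    }
}

class LocalClientExample {
    // A client call site after the refactoring: delegate to the shared util.
    static ParsedJson toParser(String serialized) {
        return SdkClientUtils.createParser(serialized);
    }
}
```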

@dbwiddis
Member Author

> Can we also test bulk, since we made changes there too?

There is no change to the functionality of bulk, which is thoroughly unit tested. In fact, the change revealed a bug in the test (wrong type) which became more obvious with the direct pass-through.

@dbwiddis
Member Author

> Can we please add more test details in the issue description?

I started #132 by writing a unit test that failed without the fix and that test is carried over here.

Not clear what you're asking for? A sanity check with Flow Framework? I've done that, as well as run the tenant aware integration tests using the new client.

> Have we tried an aggregated search with any plugin like FF/ml-commons with these changes?

Yes, although there are likely many different combinations.

@dhrubo-os
Contributor

> Not clear what you're asking for? A sanity check with Flow Framework? I've done that, as well as run the tenant aware integration tests using the new client.

Yeah this is what I was asking. If you ran the tenant aware integration tests with the new changes of PR, I'm good with this. Thanks for clarifying.

@dbwiddis
Member Author

> So while you don't "have to" make changes

I should be clearer: you will need to change the search handler's code to the new getResponse() version to fix the aggregation bug in the local client with this PR.

@dbwiddis
Member Author

Added integ test to ML Commons PR opensearch-project/ml-commons#3516.

opensearch-project/ml-commons@0bef7a9 adds an integ test that uses an aggregation to search. This results in the failure stack trace in #132

  2> REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:integTest' --tests "org.opensearch.ml.rest.RestMLModelTenantAwareIT.testModelCRUD" -Dtests.seed=8663FF7406EA2943 -Dtests.security.manager=false -Dtests.locale=yrl-CO -Dtests.timezone=Australia/Lord_Howe -Druntime.java=21
  2> org.opensearch.client.ResponseException: method [GET], host [http://[::1]:61137], URI [/_plugins/_ml/models/_search], status line [HTTP/1.1 500 Internal Server Error]
    {"error":{"root_cause":[{"type":"status_exception","reason":"Fail to search model version"}],"type":"status_exception","reason":"Fail to search model version"},"status":500}

»  org.opensearch.core.common.ParsingException: Could not parse aggregation keyed as [unique_model_names]
»       at org.opensearch.search.aggregations.Aggregations.fromXContent(Aggregations.java:165) ~[opensearch-3.0.0-beta1-SNAPSHOT.jar:3.0.0-beta1-SNAPSHOT]

» WARN ][r.suppressed             ] [integTest-0] path: /_plugins/_ml/models/_search, params: {}
»  org.opensearch.OpenSearchStatusException: Fail to search model version

opensearch-project/ml-commons@0e10ea8 updates the calls to SdkClient search to directly return the search response. The integ test passes.

@dbwiddis dbwiddis force-pushed the passthrough-response branch from 412c8a3 to 66a256f Compare April 22, 2025 05:51
@dbwiddis
Member Author

Ran a sanity test against ML Commons PR opensearch-project/ml-commons#3516 branch deployed in OpenSearch 3.0.0-beta1 docker container

Created a connector and registered a model with that connector

POST https://localhost:9200/_plugins/_ml/models/_register
{
  "name": "bedrock:embedding cohere model",
  "function_name": "remote",
  "description": "embedding open ai model",
  "connector_id": "5vYkXJYBhf97Ja-IYBpD"
}
{
    "task_id": "6_YnXJYBhf97Ja-IFhpG",
    "status": "CREATED",
    "model_id": "7PYnXJYBhf97Ja-IFhp9"
}

Deployed the model

POST https://localhost:9200/_plugins/_ml/models/7PYnXJYBhf97Ja-IFhp9/_deploy

{
    "task_id": "7vYtXJYBhf97Ja-I5hqc",
    "task_type": "DEPLOY_MODEL",
    "status": "COMPLETED"
}

Undeployed the model

https://localhost:9200/_plugins/_ml/models/7PYnXJYBhf97Ja-IFhp9/_undeploy

{
    "6vhLaJ3pRDy773z0aZCXPA": {
        "stats": {
            "7PYnXJYBhf97Ja-IFhp9": "undeployed"
        }
    }
}

Undeployed the model again

https://localhost:9200/_plugins/_ml/models/7PYnXJYBhf97Ja-IFhp9/_undeploy

{
    "6vhLaJ3pRDy773z0aZCXPA": {
        "stats": {
            "7PYnXJYBhf97Ja-IFhp9": "not_found"
        }
    }
}

Searched with an aggregation

GET https://localhost:9200/_plugins/_ml/models/_search
{
  "size": 0,
  "aggs": {
    "unique_model_names": {
      "terms": {
        "field": "name.keyword",
        "size": 10000
      }
    }
  }
}

{
    "took": 9,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "unique_model_names": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "bedrock:embedding cohere model",
                    "doc_count": 1
                }
            ]
        }
    }
}

@dbwiddis dbwiddis merged commit 5c7d005 into opensearch-project:main Apr 22, 2025
11 checks passed
@dbwiddis dbwiddis deleted the passthrough-response branch April 22, 2025 17:10
@codecov

codecov bot commented Apr 22, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 0.00%. Comparing base (9656e34) to head (66a256f).
Report is 9 commits behind head on main.


@dbwiddis
Member Author

dbwiddis commented Apr 22, 2025

@dhrubo-os thanks for the approval! Some quick follow-up:

  1. I would recommend you cherry-pick opensearch-project/ml-commons@0e10ea8 into a PR on main branch. This will directly pass search response for Local/NodeClient implementations. You may consider doing similar for other wrappers over time, but this one is more "urgent" due to the search aggregation bug.
  2. Without the cherry-pick recommended in (1), the parsing still requires a type hint. We can't pass typed_keys on the ML search REST path because it's not allowed. This may still impact the remote client, and we may want to consider always adding it to the REST path when sending a search request via the client. (Would like to target that for 3.1.0)
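
For context on point (2): typed_keys is OpenSearch's search parameter that prefixes each aggregation name with its type in the response (e.g. sterms#unique_model_names), so a generic parser can recover the concrete aggregation type without a hint. A minimal sketch of the "always add it" idea follows; the request-parameter plumbing here is illustrative, not the SDK's actual code.

```java
import java.util.HashMap;
import java.util.Map;

class SearchRequestParams {
    // Hypothetical helper: copy the caller's query parameters and force
    // typed_keys on, so aggregation types survive the serialize/parse
    // round trip through the client.
    static Map<String, String> withTypedKeys(Map<String, String> params) {
        Map<String, String> copy = new HashMap<>(params);
        copy.put("typed_keys", "true");
        return copy;
    }
}
```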

@dbwiddis dbwiddis added the backport 3.0 3.0.0 Release label Apr 22, 2025
opensearch-trigger-bot bot pushed a commit that referenced this pull request Apr 22, 2025
Signed-off-by: Daniel Widdis <[email protected]>
(cherry picked from commit 5c7d005)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
dbwiddis pushed a commit that referenced this pull request Apr 22, 2025
(cherry picked from commit 5c7d005)

Signed-off-by: Daniel Widdis <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@opensearch-trigger-bot
Contributor

The backport to 2.19 failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/opensearch-remote-metadata-sdk/backport-2.19 2.19
# Navigate to the new working tree
pushd ../.worktrees/opensearch-remote-metadata-sdk/backport-2.19
# Create a new branch
git switch --create backport/backport-141-to-2.19
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5c7d0053506a7ff3b9e1e8d5e1fa28be5f44f750
# Push it to GitHub
git push --set-upstream origin backport/backport-141-to-2.19
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/opensearch-remote-metadata-sdk/backport-2.19

Then, create a pull request where the base branch is 2.19 and the compare/head branch is backport/backport-141-to-2.19.

dbwiddis added a commit to dbwiddis/opensearch-remote-metadata-sdk that referenced this pull request Apr 28, 2025
dbwiddis added a commit to dbwiddis/opensearch-remote-metadata-sdk that referenced this pull request Apr 28, 2025

Successfully merging this pull request may close these issues.

[BUG]Unable to call aggregation in ml-commons API /_plugins/_ml/models/_search API
