[ML] Refactoring http settings and adding stats endpoint #108219

jonathan-buttner · 2024-05-02T19:56:58Z

This PR adds comments documenting the connection eviction and keep-alive strategy for the Apache HTTP client implementation. I also increased the connection pool limit so that it exceeds a single route's maximum. During testing I ran into a situation where a single service could lease all the connections in the pool, effectively blocking any other service from leasing a connection.

I also created a stats endpoint at /_inference_stats. I was hoping to use /_inference/_stats but theoretically a user could have that as their inference endpoint id 🤷‍♂️

Right now the endpoint returns stats for the internal apache connection pool. I think this could be useful because it could illuminate whether a cluster has reached the maximum connections in the pool.

Example response:

{
  "eTxRlWZzR7uarhC5x9KCtA": {
    "connection_pool_stats": {
      "leased_connections": 0,
      "pending_connections": 0,
      "available_connections": 1,
      "max_connections": 50
    }
  },
  "mTVhyQfxRCyKy83Yoc8Ttw": {
    "connection_pool_stats": {
      "leased_connections": 0,
      "pending_connections": 0,
      "available_connections": 0,
      "max_connections": 50
    }
  },
  "bAeiuk1jQAKkd3NMfqUDnw": {
    "connection_pool_stats": {
      "leased_connections": 0,
      "pending_connections": 0,
      "available_connections": 0,
      "max_connections": 50
    }
  }
}
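
To illustrate how a response like this could be read, here is a minimal sketch that flags nodes whose pool looks saturated. The class and method names are hypothetical, and treating "all connections leased, or any request pending" as saturation is an assumed heuristic, not something the endpoint defines. The second node in `main` uses made-up values.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PoolStatsCheck {
    /** Mirrors the four fields of "connection_pool_stats" in the example response. */
    record ConnectionPoolStats(int leased, int pending, int available, int max) {
        /** Assumed heuristic: every slot is leased, or requests are already waiting. */
        boolean isSaturated() {
            return leased >= max || pending > 0;
        }
    }

    /** Returns the IDs of nodes whose pool looks saturated, in iteration order. */
    static List<String> saturatedNodes(Map<String, ConnectionPoolStats> statsByNode) {
        return statsByNode.entrySet().stream()
            .filter(e -> e.getValue().isSaturated())
            .map(Map.Entry::getKey)
            .toList();
    }

    public static void main(String[] args) {
        Map<String, ConnectionPoolStats> stats = new LinkedHashMap<>();
        // Values copied from the first node in the example response.
        stats.put("eTxRlWZzR7uarhC5x9KCtA", new ConnectionPoolStats(0, 0, 1, 50));
        // Hypothetical node with an exhausted pool, for illustration only.
        stats.put("hypothetical-busy-node", new ConnectionPoolStats(50, 3, 0, 50));
        System.out.println(saturatedNodes(stats));
    }
}
```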

After discussing with the team, we decided on _inference/.diagnostics. If a user attempts to create an endpoint with a leading dot they'll get this error:

{
    "error": {
        "root_cause": [
            {
                "type": "action_request_validation_exception",
                "reason": "Validation Failed: 1: Invalid inference_id; '.diagnostic' can contain lowercase alphanumeric (a-z and 0-9), hyphens or underscores; must start and end with alphanumeric;"
            }
        ],
        "type": "action_request_validation_exception",
        "reason": "Validation Failed: 1: Invalid inference_id; '.diagnostic' can contain lowercase alphanumeric (a-z and 0-9), hyphens or underscores; must start and end with alphanumeric;"
    },
    "status": 400
}

If they attempt to create an endpoint called .diagnostics they'll get this error:

{
    "error": "Incorrect HTTP method for uri [/_inference/.diagnostics] and method [PUT], allowed: [GET]",
    "status": 405
}
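
The rule quoted in the first error message can be sketched as a regular expression. This pattern is reconstructed from the message text only and is an assumption; the actual Elasticsearch validator may differ (for example in how it treats single-character ids). `InferenceIdCheck` and `isValidInferenceId` are hypothetical names.

```java
import java.util.regex.Pattern;

public class InferenceIdCheck {
    // Assumed pattern: lowercase alphanumerics, hyphens or underscores,
    // starting and ending with an alphanumeric character.
    private static final Pattern VALID_ID =
        Pattern.compile("^[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?$");

    static boolean isValidInferenceId(String id) {
        return id != null && VALID_ID.matcher(id).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidInferenceId("my-endpoint_1")); // accepted by this sketch
        System.out.println(isValidInferenceId(".diagnostic"));   // rejected: leading dot
    }
}
```

Because a leading dot can never pass this kind of check, a path such as `.diagnostics` cannot collide with a user-created inference id.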

jonathan-buttner · 2024-05-02T19:58:22Z

...ference/src/main/java/org/elasticsearch/xpack/inference/external/http/HttpClientManager.java

-        "xpack.inference.http.max_connections",
+    public static final Setting<Integer> MAX_TOTAL_CONNECTIONS = Setting.intSetting(
+        "xpack.inference.http.max_total_connections",
+        50, // default


Increasing this to 50 by default so that it exceeds the maximum for a single route, which prevents a single service from exhausting the entire connection pool.

jonathan-buttner · 2024-05-02T19:59:49Z

...ference/src/main/java/org/elasticsearch/xpack/inference/external/http/HttpClientManager.java

+     * The max number of connections a single route can lease.
+     */
+    public static final Setting<Integer> MAX_ROUTE_CONNECTIONS = Setting.intSetting(
+        "xpack.inference.http.max_route_connections",


Allowing the per-route limit to be controlled separately. I don't expect users to modify these settings often, if ever. I could see these being used if we run into a situation in serverless or an SDH where we can help increase throughput.
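
A rough simulation of why the total limit should exceed the per-route limit. This is a toy model, not the Apache client's pool: it refuses a lease outright instead of queueing it as pending, and the per-route value of 20 is a hypothetical number chosen for illustration (only the total of 50 comes from this PR).

```java
import java.util.HashMap;
import java.util.Map;

public class PoolLimitsSketch {
    private final int maxTotal;
    private final int maxPerRoute;
    private final Map<String, Integer> leasedByRoute = new HashMap<>();
    private int totalLeased;

    PoolLimitsSketch(int maxTotal, int maxPerRoute) {
        this.maxTotal = maxTotal;
        this.maxPerRoute = maxPerRoute;
    }

    /** Tries to lease one connection for the route; returns false if either limit is hit. */
    boolean tryLease(String route) {
        int routeLeased = leasedByRoute.getOrDefault(route, 0);
        if (totalLeased >= maxTotal || routeLeased >= maxPerRoute) {
            return false;
        }
        leasedByRoute.put(route, routeLeased + 1);
        totalLeased++;
        return true;
    }

    /** Leases connections for the route until refused; returns how many were granted. */
    int leaseUntilRefused(String route) {
        int granted = 0;
        while (tryLease(route)) {
            granted++;
        }
        return granted;
    }

    public static void main(String[] args) {
        // Total equal to the per-route cap: one busy service starves the other.
        PoolLimitsSketch equalLimits = new PoolLimitsSketch(20, 20);
        equalLimits.leaseUntilRefused("service-a");
        System.out.println("service-b can lease: " + equalLimits.tryLease("service-b"));

        // Total larger than the per-route cap: the other service still gets a connection.
        PoolLimitsSketch largerTotal = new PoolLimitsSketch(50, 20);
        largerTotal.leaseUntilRefused("service-a");
        System.out.println("service-b can lease: " + largerTotal.tryLease("service-b"));
    }
}
```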

jonathan-buttner · 2024-05-02T20:00:18Z

...ference/src/main/java/org/elasticsearch/xpack/inference/external/http/HttpClientManager.java

    private IdleConnectionEvictor connectionEvictor;
    private final HttpClient httpClient;

+    private volatile TimeValue evictionInterval;


The common pattern is to use volatile for settings, so moving to that.
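
A minimal sketch of that pattern, using only the JDK: `java.time.Duration` stands in for `TimeValue`, and the holder class and its update callback are hypothetical names rather than the code in this PR. The point is that a `volatile` field makes a value written by a settings-update thread visible to other threads on their next read.

```java
import java.time.Duration;
import java.util.function.Consumer;

public class EvictionIntervalHolder {
    // volatile: a write from one thread is visible to subsequent reads on other threads.
    private volatile Duration evictionInterval;

    EvictionIntervalHolder(Duration initial) {
        this.evictionInterval = initial;
    }

    /** Callback that a settings registry could invoke when the setting changes (assumed wiring). */
    Consumer<Duration> updateConsumer() {
        return this::setEvictionInterval;
    }

    void setEvictionInterval(Duration newInterval) {
        this.evictionInterval = newInterval;
    }

    Duration getEvictionInterval() {
        return evictionInterval;
    }

    public static void main(String[] args) {
        EvictionIntervalHolder holder = new EvictionIntervalHolder(Duration.ofMinutes(1));
        // Simulate a dynamic settings update arriving through the callback.
        holder.updateConsumer().accept(Duration.ofSeconds(30));
        System.out.println(holder.getEvictionInterval());
    }
}
```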

jonathan-buttner · 2024-05-03T21:08:49Z

@elasticmachine merge upstream

elasticsearchmachine · 2024-05-07T12:23:04Z

Pinging @elastic/ml-core (Team:ML)

maxhniebergall

LGTM!

maxhniebergall

LGTM

Refactoring http settings and adding comments

d0c42c2

jonathan-buttner added >non-issue :ml Machine learning Team:ML Meta label for the ML team v8.15.0 labels May 2, 2024

Removing new line

9fd9efc

jonathan-buttner commented May 2, 2024

Adding stats endpoint

9ec77ac

jonathan-buttner changed the title ~~[ML] Refactoring http settings and adding comments~~ [ML] Refactoring http settings and adding stats endpoint May 2, 2024

Improving comment and adding tests

8e7ff2b

elasticmachine and others added 4 commits May 3, 2024 22:08

Merge branch 'main' into ml-inference-http-pooling-settings

8214519

Fixing typo in setting name

ea2443b

Merge branch 'ml-inference-http-pooling-settings' of github.com:jonat…

dadf7db

…han-buttner/elasticsearch into ml-inference-http-pooling-settings

Merge branch 'main' of github.com:elastic/elasticsearch into ml-infer…

95f242d

…ence-http-pooling-settings

jonathan-buttner added the cloud-deploy Publish cloud docker image for Cloud-First-Testing label May 6, 2024

jonathan-buttner added 5 commits May 6, 2024 15:47

Switching to TransportNodesAction

fb07fca

Fixing test

be276cc

Trying to fix test

e7d6a49

Adding equals hashcode

e5adc48

Trying to send to all nodes

f143323

jonathan-buttner marked this pull request as ready for review May 7, 2024 12:22

jonathan-buttner requested review from davidkyle and maxhniebergall May 7, 2024 12:22

maxhniebergall approved these changes May 7, 2024

jonathan-buttner added 4 commits May 16, 2024 15:03

Merge branch 'main' of github.com:elastic/elasticsearch into ml-infer…

4ac68a3

…ence-http-pooling-settings

Renaming to .diagnostics

0c3c5cb

Merge branch 'main' of github.com:elastic/elasticsearch into ml-infer…

27908ac

…ence-http-pooling-settings

Removing unneeded variable

45f797e

jonathan-buttner requested a review from maxhniebergall May 28, 2024 16:10

maxhniebergall approved these changes May 28, 2024

jonathan-buttner merged commit 96075de into elastic:main May 28, 2024
16 checks passed

jonathan-buttner deleted the ml-inference-http-pooling-settings branch May 28, 2024 19:15
