
[ML] Refactoring http settings and adding stats endpoint #108219

Open
wants to merge 13 commits into main
Conversation

@jonathan-buttner (Contributor) commented May 2, 2024

This PR adds some comments around the connection eviction and keep-alive strategy documentation for the Apache HTTP client implementation. I also increased the connection pool limit to exceed a single route's maximum. I've run into a situation during testing where a single service can lease all the connections in the pool, effectively blocking any other service from leasing a connection.

I also created a stats endpoint at /_inference_stats. I was hoping to use /_inference/_stats but theoretically a user could have that as their inference endpoint id 🤷‍♂️

Right now the endpoint returns stats for the internal Apache connection pool. I think this could be useful because it can show whether a cluster has reached the maximum number of connections in the pool.

Example response:

{
  "eTxRlWZzR7uarhC5x9KCtA": {
    "connection_pool_stats": {
      "leased_connections": 0,
      "pending_connections": 0,
      "available_connections": 1,
      "max_connections": 50
    }
  },
  "mTVhyQfxRCyKy83Yoc8Ttw": {
    "connection_pool_stats": {
      "leased_connections": 0,
      "pending_connections": 0,
      "available_connections": 0,
      "max_connections": 50
    }
  },
  "bAeiuk1jQAKkd3NMfqUDnw": {
    "connection_pool_stats": {
      "leased_connections": 0,
      "pending_connections": 0,
      "available_connections": 0,
      "max_connections": 50
    }
  }
}
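For readers skimming the response shape, here is a rough, self-contained sketch (hypothetical names, not the PR's actual classes) of how the per-node stats map to that JSON, including the "has the pool hit max_connections" check the description alludes to:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class InferenceStatsSketch {
    // Mirrors the "connection_pool_stats" object returned per node id.
    record ConnectionPoolStats(int leased, int pending, int available, int max) {
        // The condition the PR description calls out: the pool has hit its ceiling.
        boolean atCapacity() {
            return leased >= max;
        }

        String toJson() {
            return "{\"connection_pool_stats\":{"
                + "\"leased_connections\":" + leased + ","
                + "\"pending_connections\":" + pending + ","
                + "\"available_connections\":" + available + ","
                + "\"max_connections\":" + max + "}}";
        }
    }

    public static void main(String[] args) {
        // Node ids keyed to their pool stats, matching the example response shape.
        Map<String, ConnectionPoolStats> perNode = new LinkedHashMap<>();
        perNode.put("eTxRlWZzR7uarhC5x9KCtA", new ConnectionPoolStats(0, 0, 1, 50));
        System.out.println(perNode.get("eTxRlWZzR7uarhC5x9KCtA").toJson());
    }
}
```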

@jonathan-buttner jonathan-buttner added >non-issue :ml Machine learning Team:ML Meta label for the ML team v8.15.0 labels May 2, 2024
public static final Setting<Integer> MAX_TOTAL_CONNECTIONS = Setting.intSetting(
    "xpack.inference.http.max_total_connections", // renamed from "xpack.inference.http.max_connections"
    50, // default

Increasing this default to 50 so that the total pool size exceeds the maximum for a single route, to avoid a single service exhausting the entire connection pool.

/**
 * The max number of connections a single route can lease.
 */
public static final Setting<Integer> MAX_ROUTE_CONNECTIONS = Setting.intSetting(
    "xpack.inference.http.max_route_connections",

Allowing the per-route limit to be controlled separately. I don't expect users to modify these settings often, if ever. I could see them being used if we run into a situation in serverless or an SDH where we need to increase throughput.
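As a toy illustration (not the Apache connection pool's real implementation; all names are hypothetical) of why the total limit should exceed the per-route limit: if the two limits are equal, one busy route can lease every connection and starve all other routes, while headroom in the total limit keeps other routes serviceable.

```java
import java.util.HashMap;
import java.util.Map;

public class PoolLimitSketch {
    final int maxTotal;
    final int maxPerRoute;
    int totalLeased = 0;
    final Map<String, Integer> leasedPerRoute = new HashMap<>();

    PoolLimitSketch(int maxTotal, int maxPerRoute) {
        this.maxTotal = maxTotal;
        this.maxPerRoute = maxPerRoute;
    }

    // Returns true if a connection could be leased for the route;
    // in the real pool a failed lease would block or queue instead.
    boolean tryLease(String route) {
        int routeLeased = leasedPerRoute.getOrDefault(route, 0);
        if (totalLeased >= maxTotal || routeLeased >= maxPerRoute) {
            return false;
        }
        leasedPerRoute.put(route, routeLeased + 1);
        totalLeased++;
        return true;
    }

    public static void main(String[] args) {
        // With maxTotal == maxPerRoute, one busy route starves everyone else.
        PoolLimitSketch starved = new PoolLimitSketch(20, 20);
        for (int i = 0; i < 20; i++) starved.tryLease("openai");
        System.out.println(starved.tryLease("cohere")); // false: pool exhausted

        // With maxTotal > maxPerRoute, headroom remains for other routes.
        PoolLimitSketch headroom = new PoolLimitSketch(50, 20);
        for (int i = 0; i < 20; i++) headroom.tryLease("openai");
        System.out.println(headroom.tryLease("cohere")); // true
    }
}
```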

private IdleConnectionEvictor connectionEvictor;
private final HttpClient httpClient;

private volatile TimeValue evictionInterval;

The common pattern is to use a volatile field for updatable settings, so I'm moving to that.
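The pattern mentioned above, in rough outline (a self-contained sketch with hypothetical names; in Elasticsearch the setter is typically wired up via `ClusterSettings#addSettingsUpdateConsumer`): the value lives in a volatile field so a write from the settings-update thread is visible to request threads without locking.

```java
public class EvictionSettingsSketch {
    // volatile: a write from the settings-update thread is immediately
    // visible to any thread reading the interval, with no locking.
    private volatile long evictionIntervalMillis;

    EvictionSettingsSketch(long initialMillis) {
        this.evictionIntervalMillis = initialMillis;
    }

    // In Elasticsearch this would typically be registered as the update
    // consumer for the setting, e.g.
    // clusterSettings.addSettingsUpdateConsumer(EVICTION_INTERVAL, this::setEvictionInterval)
    void setEvictionInterval(long millis) {
        this.evictionIntervalMillis = millis;
    }

    long evictionIntervalMillis() {
        return evictionIntervalMillis;
    }

    public static void main(String[] args) {
        EvictionSettingsSketch settings = new EvictionSettingsSketch(60_000);
        settings.setEvictionInterval(30_000); // simulated dynamic settings update
        System.out.println(settings.evictionIntervalMillis()); // 30000
    }
}
```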

@jonathan-buttner jonathan-buttner changed the title [ML] Refactoring http settings and adding comments [ML] Refactoring http settings and adding stats endpoint May 2, 2024
@jonathan-buttner

@elasticmachine merge upstream

@jonathan-buttner jonathan-buttner added the cloud-deploy Publish cloud docker image for Cloud-First-Testing label May 6, 2024
@jonathan-buttner jonathan-buttner marked this pull request as ready for review May 7, 2024 12:22
@elasticsearchmachine (Collaborator)

Pinging @elastic/ml-core (Team:ML)

@maxhniebergall (Member) left a comment

LGTM!
