[REP] Ray Export API. #55
## Summary
### General Motivation

In the current design of Ray, the way to export various states in the Ray cluster are inconsistent.
I believe the State API is typically recommended for end users to fetch current state info of a running Ray cluster, but the state API gets data from multiple sources depending on the resource type.
The export API will provide another way to fetch state data that works well for large scale clusters or can be saved to query after the cluster is terminated, but I don’t see disjoint interfaces for state data as the main motivation for this feature.
Makes sense. I mentioned it in the Key requirements:
To obtain the current status of the ray cluster at one time, you need to use state api.
we can achieve the observable ability independent of the Ray cluster. The most typical scenario is to build the Ray history server.

#### Key requirements:
- Need to expose all necessary ray states for basic observability, tasks/actor/jobs/nodes.
A good initial use case of this data is rebuilding the dashboard APIs, so we should make sure the schema for each resource contains all required fields. Timestamps should also be included so the reader can reconstruct a history of state changes.
Let me try to understand. Do you mean that we can use the data format of the dashboard API to check the completeness of the data exported by the export API? For example, the exported state is published to GCS, and the dashboard gets its data from GCS.
Yeah, we don’t need to exactly use the dashboard API schemas because that would require some post processing which we can leave to the users for flexibility. We should be able to postprocess the export API data to match the dashboard API responses though (which is needed for applications like history server)
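To make the postprocessing idea concrete, here is a minimal sketch of how a consumer might fold timestamped export events into a latest-state view (roughly what a dashboard-style response needs) and into a per-resource history. The event fields (`resource`, `id`, `state`, `timestamp`) and the function names are assumptions for illustration only; they are not the schema proposed in this REP.

```python
# Hypothetical export events. The field names are assumptions for this
# sketch, not the schema defined by the REP.
events = [
    {"resource": "actor", "id": "a1", "state": "PENDING_CREATION", "timestamp": 100},
    {"resource": "actor", "id": "a1", "state": "ALIVE", "timestamp": 105},
    {"resource": "task", "id": "t1", "state": "RUNNING", "timestamp": 103},
    {"resource": "actor", "id": "a1", "state": "DEAD", "timestamp": 140},
]


def latest_state(events):
    """Fold events into the most recent event per (resource, id)."""
    latest = {}
    # Sort by timestamp so later events overwrite earlier ones.
    for event in sorted(events, key=lambda e: e["timestamp"]):
        latest[(event["resource"], event["id"])] = event
    return latest


def history(events, resource, resource_id):
    """Return the ordered (timestamp, state) transitions for one resource."""
    return [
        (e["timestamp"], e["state"])
        for e in sorted(events, key=lambda e: e["timestamp"])
        if e["resource"] == resource and e["id"] == resource_id
    ]
```

A history-server-style application could call `latest_state` to answer "what is the state now" and `history` to replay the transitions, which is why each event needs a timestamp.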

- Export event
We propose to use pubsub for event export, because pubsub is easier to decouple systems,
My opinion is we can limit this REP to generating the events and let users implement exporting the events to an external system. We can share recommendations on how to do this (e.g., log ingestion using Vector, which can send data to various sinks depending on the use case), but I think the pubsub event export isn’t necessary for this version, so that we keep the interface simple and flexible.
Or instead, could we give some possible export methods (such as GCS or Vector) as a suggestion, not a proposal?
Yeah, I think it’s fine to describe some export options as a suggestion, but this shouldn’t be the key part of this REP. Could you condense and move this to a “How to use Export API” section, so it’s clear these are ideas for the users?
I have two questions:
Mostly rewording and restructuring comments
Original:
- Need to expose all necessary ray states for basic observability, tasks/actor/jobs/nodes.

Suggested:
- Need to expose necessary Ray state information which is currently returned in the dashboard APIs for tasks, actors, jobs, and nodes.
Original:
- Light load may be added when we export states, but we should be able to limit this impact (eg: max data stored in memory, not blocking any control logic).
The export API should not put too much load on the Ray cluster.

Suggested:
- We should be able to limit any additional load from exporting state information (eg: max data stored in memory). Default configurations for the export API should have minimal impact on the Ray cluster and we will run performance tests to determine this.
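One way to picture the "max data stored in memory" limit is a fixed-capacity buffer that discards the oldest events once full and counts what it dropped. This is only a sketch under assumptions: `BoundedEventBuffer` and its methods are hypothetical names, not part of Ray or of this REP, and a real implementation might prefer a different drop policy.

```python
from collections import deque


class BoundedEventBuffer:
    """Hypothetical in-memory buffer that caps the number of pending events."""

    def __init__(self, max_events):
        # deque(maxlen=...) silently evicts the oldest item when full.
        self._events = deque(maxlen=max_events)
        self.dropped = 0

    def add(self, event):
        # Count the eviction before appending so the loss is observable.
        if len(self._events) == self._events.maxlen:
            self.dropped += 1
        self._events.append(event)

    def drain(self):
        """Return all buffered events in arrival order and empty the buffer."""
        drained = list(self._events)
        self._events.clear()
        return drained
```

Because `add` never blocks and never grows past `max_events`, the producer's control logic is not delayed by a slow consumer; the trade-off is that events can be lost, which the `dropped` counter makes visible.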
Original:
- Export states streamingly rather than fetching it directly from ray cluster. To obtain the current status of the ray cluster at one time,
you need to use [the state observability api](https://github.com/ray-project/enhancements/blob/main/reps/2022-04-21-state-observability-apis.md "the state observability api") .

Suggested:
- Export events should be emitted on state change, rather than being pulled from the current status of the ray cluster (as done with [the state observability api](https://github.com/ray-project/enhancements/blob/main/reps/2022-04-21-state-observability-apis.md)).
This was how I interpreted it, please let me know if it refers to something else
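The difference between emitting on state change and pulling the current status can be sketched as follows. `StateTracker`, `set_state`, and `snapshot` are hypothetical names used only for illustration; they do not correspond to any Ray component.

```python
class StateTracker:
    """Hypothetical component that owns resource state and pushes changes."""

    def __init__(self, on_change):
        self._states = {}
        self._on_change = on_change

    def set_state(self, resource_id, new_state):
        old_state = self._states.get(resource_id)
        if old_state == new_state:
            return  # No transition, so nothing is emitted.
        self._states[resource_id] = new_state
        # Push model: every transition produces exactly one event.
        self._on_change({"id": resource_id, "old": old_state, "new": new_state})

    def snapshot(self):
        # Pull model: a caller only sees whatever the state is right now.
        return dict(self._states)


emitted = []
tracker = StateTracker(emitted.append)
tracker.set_state("task-1", "PENDING")
tracker.set_state("task-1", "RUNNING")
tracker.set_state("task-1", "FINISHED")
```

After these three calls, `emitted` holds all three transitions, while a single `snapshot()` taken afterwards reports only `FINISHED`. That gap is what a pull-based reader can miss.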
Original:
- Certain states can be exported respectively. RayCores (actors/tasks/nodes), RayData, and RayServe-related states can be selectively exported.
For example, if a user is only interested in the state of RayServe and not in the state of RayCore, then the export of RayCore state can be disabled to reduce overhead.

Suggested:
- [P1] Allow users to selectively enable exporting RayCore (actors/tasks/jobs/nodes), RayData, or RayServe-related states
Let’s label this P1 and move to the bottom of the list because it will be more relevant after we finish implementing the library events
- Friendly to all types of users (especially cloud vendors), easy to deploy and use, without modifying Ray.
I think we can remove this point because it doesn’t add a specific requirement. Was there something in particular you wanted to call out here?
| Objects | CoreWorker and raylet | None | |
| Jobs | JobManager | When JobInfoStorageClient.put_status called | https://docs.google.com/document/d/1upQRU-f8WgVH_NWBmeJyegyOSJwNDPPqnl1cCGpmiGo/edit?usp=sharing |
| Nodes | GCS | GcsNodeManager::HandleXXXNode | https://docs.google.com/document/d/1qjoF51h2oUN2sr_MtPnovbNFZYZrh3WLNR_P0HrUuOI/edit?usp=sharing |
| Placement groups | GCS | GcsPlacementGroupManager::HandleXXXPlacementGroup | |
Do you have an example schema for placement groups? Otherwise let’s remove from this table or mark as P1
- RayServe
State change events for replicas, deployments, applications.

| Event | Event Source | When to export | Format example(maybe a file link) |
| ---- | ---- | ---- | ---- |
| Serve App | ServeController Actor | None | None |

- RayData
All datasets, the dag, and execution progress.

| Event | Event Source | When to export | Format example(maybe a file link) |
| ---- | ---- | ---- | ---- |
| Datasets | ray.data.internal.stats._StatsActor | _StatsActor's update function called. | None |
Let’s actually remove the library events info for now because we don’t have schemas for these yet. We can just say: “We are planning to emit events for Ray Serve and Ray Data in the future after validating implementation and use cases of the core events.”
I would re-organize this to something like

Original:
In the current design of Ray, the way to export various states in the Ray cluster are inconsistent.
For example, the state of the Actor is broadcast through GCS pubsub, and to obtain the state change of the node,
it is necessary to query rpc service (NodeInfoGcsService). For the jobs submitted through job submission,
there is no way to expose the state. The high-level libraries on top of Ray also don't have a unified export way.
E.g. RayServe and RayData collect the states through their own StateActor and report to the Dashboard respectively.

It is very difficult to obtain these ray basic states outside the Ray cluster. This issue is reflected in two aspects:
1. Scale of data exceeds current dashboard API limits.
2. After the Ray cluster terminates, all status data is lost.
If a unified export API can be defined,
we can achieve the observable ability independent of the Ray cluster. The most typical scenario is to build the Ray history server.

Suggested:
There are several major improvements requested by users about the Ray Dashboard:
1. Currently, all state data is lost when the Ray cluster terminates and users are looking for persistence of this data.
2. The Ray dashboard has scalability limits on the amount of data stored and returned. Advanced users would benefit from greater scalability.
3. There is no unified and accurate way to get state events pushed on state change across various resource types. The State and dashboard APIs support a pull model, where it is possible to miss some events. There is also not a consistent way across resource types to get this information directly from components that generate it.

A unified export API could serve as a base to address these issues by allowing state data to be exported and processed separately. This would ensure that observability features remain functional independent of the Ray cluster.
to obtain the state change of the node, it is necessary to query rpc service (NodeInfoGcsService)
Just curious, do you or other users already query these GCS services directly to get data?
Yeah, we wanted to open source the export API as the first step for various dashboard persistence and scalability features. We are looking into prototyping a history server to better understand the design options. One open question for a full history server is choosing the right backend and how to interface with it. We need to choose a lightweight backend, or we should probably support plugging into multiple backends.
Yes, I have an initial working version of this in review, but we still need to verify whether there are any performance regressions.