[REP] Ray Export API. #55

MissiontoMars · 2024-08-06T13:20:45Z

No description provided.

nikitavemuri · 2024-08-12T16:29:40Z

reps/2024-06-13-export-api.md

+## Summary
+### General Motivation
+
+In the current design of Ray, the way to export various states in the Ray cluster are inconsistent.


I believe the State API is typically recommended for end users to fetch current state info of a running Ray cluster, but the state API gets data from multiple sources depending on the resource type.

The export API will provide another way to fetch state data that works well for large scale clusters or can be saved to query after the cluster is terminated, but I don’t see disjoint interfaces for state data as the main motivation for this feature.

Make sense. I mentioned it in the Key requirements:
To obtain the current status of the ray cluster at one time, you need to use state api.

reps/2024-06-13-export-api.md

nikitavemuri · 2024-08-12T16:30:30Z

reps/2024-06-13-export-api.md

+we can achieve the observable ability independent of the Ray cluster. The most typical scenario is to build the Ray history server.
+
+#### Key requirements:
+- Need to expose all necessary ray states for basic observability, tasks/actor/jobs/nodes.


A good initial use case of this data is rebuilding the dashboard APIs, so we should make sure the schema for each resource contains all required fields. Timestamps should also be included so the reader can reconstruct a history of state changes.

Let me try to understand. Do you mean that we can use the data format of the dashboard api to correct the completeness of the exported data by the export api? For example, exported state is published to gcs, and dashboard gets data from gcs.

Yeah, we don’t need to exactly use the dashboard API schemas because that would require some post processing which we can leave to the users for flexibility. We should be able to postprocess the export API data to match the dashboard API responses though (which is needed for applications like history server)

reps/2024-06-13-export-api.md

nikitavemuri · 2024-08-12T16:34:19Z

reps/2024-06-13-export-api.md

+
+
+- Export event
+We propose to use pubsub for event export, because pubsub is easier to decouple systems,


My opinion is we can limit this REP to generating the events and let users implement exporting the events to an external system. We can share recommendations on how to do this (eg: log ingestion using Vector which can send data to various sinks per the use case), but I think the pub sub event export isn’t necessary for this version to keep the interface simple and flexible

Or instead, we give some possible export methods (such as gcs or vector) as a suggestion, not a proposal?

Yeah, I think it’s fine to describe some export options as a suggestion, but this shouldn’t be the key part of this REP. Could you condense and move this to a “How to use Export API” section, so it’s clear these are ideas for the users?

SongGuyang · 2024-09-03T11:51:04Z

I have two questions:

It looks like that users can not get a completed history server after the implementation of this REP. Do you have any plans to support a default history server including UI?
What's the status of this REP? Do you have any plans for the implementation of this REP?

nikitavemuri

Mostly rewording and restructuring comments

nikitavemuri · 2024-09-05T21:39:37Z

reps/2024-06-13-export-api.md

+we can achieve the observable ability independent of the Ray cluster. The most typical scenario is to build the Ray history server.
+
+#### Key requirements:
+- Need to expose all necessary ray states for basic observability, tasks/actor/jobs/nodes.


Suggested change

- Need to expose all necessary ray states for basic observability, tasks/actor/jobs/nodes.

- Need to expose necessary Ray state information which is currently returned in the dashboard APIs for tasks, actors, jobs, and nodes.

nikitavemuri · 2024-09-05T21:40:59Z

reps/2024-06-13-export-api.md

+- Light load may be added when we export states, but we should be able to limit this impact (eg: max data stored in memory, not blocking any control logic).
+  The export API should not put too much load on the Ray cluster.


Suggested change

- Light load may be added when we export states, but we should be able to limit this impact (eg: max data stored in memory, not blocking any control logic).

The export API should not put too much load on the Ray cluster.

- We should be able to limit any additional load from exporting state information (eg: max data stored in memory). Default configurations for the export API should have minimal impact on the Ray cluster and we will run performance tests to determine this.

nikitavemuri · 2024-09-05T21:43:13Z

reps/2024-06-13-export-api.md

+- Export states streamingly rather than fetching it directly from ray cluster. To obtain the current status of the ray cluster at one time,
+  you need to use [the state observability api](https://github.com/ray-project/enhancements/blob/main/reps/2022-04-21-state-observability-apis.md "the state observability api") . 


Suggested change

- Export states streamingly rather than fetching it directly from ray cluster. To obtain the current status of the ray cluster at one time,

you need to use [the state observability api](https://github.com/ray-project/enhancements/blob/main/reps/2022-04-21-state-observability-apis.md "the state observability api") .

- Export events should be emitted on state change, rather than being pulled from the current status of the ray cluster (as done with [the state observability api](https://github.com/ray-project/enhancements/blob/main/reps/2022-04-21-state-observability-apis.md)).

This was how I interpreted it, please let me know if it refers to something else

nikitavemuri · 2024-09-05T21:44:01Z

reps/2024-06-13-export-api.md

+- Certain states can be exported respectively. RayCores (actors/tasks/nodes), RayData, and RayServe-related states can be selectively exported.
+  For example, if a user is only interested in the state of RayServe and not in the state of RayCore, then the export of RayCore state can be disabled to reduce overhead.


Suggested change

- Certain states can be exported respectively. RayCores (actors/tasks/nodes), RayData, and RayServe-related states can be selectively exported.

For example, if a user is only interested in the state of RayServe and not in the state of RayCore, then the export of RayCore state can be disabled to reduce overhead.

- [P1] Allow users to selectively enable exporting RayCore (actors/tasks/jobs/nodes), RayData, or RayServe-related states

Let’s label this P1 and move to the bottom of the list because it will be more relevant after we finish implementing the library events

nikitavemuri · 2024-09-05T21:45:23Z

reps/2024-06-13-export-api.md

+  you need to use [the state observability api](https://github.com/ray-project/enhancements/blob/main/reps/2022-04-21-state-observability-apis.md "the state observability api") . 
+- Certain states can be exported respectively. RayCores (actors/tasks/nodes), RayData, and RayServe-related states can be selectively exported.
+  For example, if a user is only interested in the state of RayServe and not in the state of RayCore, then the export of RayCore state can be disabled to reduce overhead.
+- Friendly to all types of users (especially cloud vendors), easy to deploy and use, without modifying Ray.


I think we can remove this point because it doesn’t add a specific requirement. Was there something in particular you wanted to call out here?

nikitavemuri · 2024-09-05T21:55:40Z

reps/2024-06-13-export-api.md

+| Objects           | CoreWorker and raylet | None                                                                              |          |
+| Jobs              | JobManager            | When JobInfoStorageClient.put_status called                                       |  https://docs.google.com/document/d/1upQRU-f8WgVH_NWBmeJyegyOSJwNDPPqnl1cCGpmiGo/edit?usp=sharing        |
+| Nodes             | GCS                   | GcsNodeManager::HandleXXXNode                                                     |  https://docs.google.com/document/d/1qjoF51h2oUN2sr_MtPnovbNFZYZrh3WLNR_P0HrUuOI/edit?usp=sharing        |
+| Placement groups  | GCS                   | GcsPlacementGroupManager::HandleXXXPlacementGroup                                 |          |


Do you have an example schema for placement groups? Otherwise let’s remove from this table or mark as P1

nikitavemuri · 2024-09-05T21:59:26Z

reps/2024-06-13-export-api.md

+- RayServe
+State change events for replicas, deployments, applications.
+
+|  Event            | Event Source          | When to export | Format example(maybe a file link)|
+|  ----             | ----                  | ----           | ----                               |
+| Serve App         | ServeController Actor | None           | None         |
+
+- RayData
+All datasets, the dag, and execution progress.
+
+|  Event            | Event Source          | When to export | Format example(maybe a file link)|
+|  ----             | ----                  | ----           | ----                               |
+| Datasets         | ray.data.internal.stats._StatsActor | _StatsActor's update function called.           | None         |


Let’s actually remove the library events info for now because we don’t have schemas for these yet. Can just say We are planning to emit events for Ray Serve and Ray Data in the future after validating implementation and use cases of the core events.

nikitavemuri · 2024-09-05T22:43:39Z

reps/2024-06-13-export-api.md

+
+
+- Export event
+We propose to use pubsub for event export, because pubsub is easier to decouple systems,


Yeah, I think it’s fine to describe some export options as a suggestion, but this shouldn’t be the key part of this REP. Could you condense and move this to a “How to use Export API” section, so it’s clear these are ideas for the users?

nikitavemuri · 2024-09-05T22:44:34Z

reps/2024-06-13-export-api.md

+we can achieve the observable ability independent of the Ray cluster. The most typical scenario is to build the Ray history server.
+
+#### Key requirements:
+- Need to expose all necessary ray states for basic observability, tasks/actor/jobs/nodes.


Yeah, we don’t need to exactly use the dashboard API schemas because that would require some post processing which we can leave to the users for flexibility. We should be able to postprocess the export API data to match the dashboard API responses though (which is needed for applications like history server)

nikitavemuri · 2024-09-05T23:09:51Z

reps/2024-06-13-export-api.md

+In the current design of Ray, the way to export various states in the Ray cluster are inconsistent.
+For example, the state of the Actor is broadcast through GCS pubsub, and to obtain the state change of the node,
+it is necessary to query rpc service (NodeInfoGcsService). For the jobs submitted through job submission,
+there is no way to expose the state. The high-level libraries on top of Ray also don't have a unified export way.
+E.g. RayServe and RayData collect the states through their own StateActor and report to the Dashboard respectively.
+
+It is very difficult to obtain these ray basic states outside the Ray cluster. This issue is reflected in two aspects:
+1. Scale of data exceeds current dashboard API limits.
+2. After the Ray cluster terminates, all status data is lost.
+If a unified export API can be defined,
+we can achieve the observable ability independent of the Ray cluster. The most typical scenario is to build the Ray history server.


I would re-organize this to something like

Suggested change

In the current design of Ray, the way to export various states in the Ray cluster are inconsistent.

For example, the state of the Actor is broadcast through GCS pubsub, and to obtain the state change of the node,

it is necessary to query rpc service (NodeInfoGcsService). For the jobs submitted through job submission,

there is no way to expose the state. The high-level libraries on top of Ray also don't have a unified export way.

E.g. RayServe and RayData collect the states through their own StateActor and report to the Dashboard respectively.

It is very difficult to obtain these ray basic states outside the Ray cluster. This issue is reflected in two aspects:

1. Scale of data exceeds current dashboard API limits.

2. After the Ray cluster terminates, all status data is lost.

If a unified export API can be defined,

we can achieve the observable ability independent of the Ray cluster. The most typical scenario is to build the Ray history server.

There are several major improvements requested by users about the Ray Dashboard

1. Currently, all state data is lost when the Ray cluster terminates and users are looking for persistence of this data.

2. The Ray dashboard has scalability limits on the amount of data stored and returned. Advanced users would benefit from greater scalability.

3. There is no unified and accurate way to get state events pushed on state change across various resource types. The State and dashboard APIs support a pull model, where it is possible to miss some events. There is also not a consistent way across resource types to get this information directly from components that generate it.

A unified export API could serve as a base to address these issues by allowing state data to be exported and processed separately. This would ensure that observability features remain functional independent of the Ray cluster.

to obtain the state change of the node, it is necessary to query rpc service (NodeInfoGcsService)

Just curious, do you or other users already query these GCS services directly to get data?

nikitavemuri · 2024-09-05T23:33:57Z

@SongGuyang

It looks like that users can not get a completed history server after the implementation of this REP. Do you have any plans to support a default history server including UI?

Yeah, we wanted to open source the export API as the first step for various dashboard persistence and scalability features. We are looking into prototyping a history server to better understand the design options.

One open question for a full history server is choosing the right backend and how to interface with it. We need to choose a light weight backend, or we should probably support plugging into multiple backends.

What's the status of this REP? Do you have any plans for the implementation of this REP?

Yes, I have an initial working version of this in review, but we need to still verify if there are any performance regressions

MissiontoMars added 3 commits June 13, 2024 20:03

init

9e0c93a

refine

f3273ae

refine

add47a4

MissiontoMars requested a review from nikitavemuri August 7, 2024 02:00