-
Notifications
You must be signed in to change notification settings - Fork 7.1k
Open
Labels
bugSomething that is supposed to be working; but isn'tSomething that is supposed to be working; but isn'tcommunity-backlogcoreIssues that should be addressed in Ray CoreIssues that should be addressed in Ray CorestabilitytriageNeeds triage (eg: priority, bug/not-bug, and owning component)Needs triage (eg: priority, bug/not-bug, and owning component)
Description
What happened + What you expected to happen
When I use ray head to send tasks to worker, the task should be finished properly. However, during the task processing the ray cluster crashes very frequently and transiently.
Versions / Dependencies
Ray: 2.9.3
Python: 3.10.16
Host machine OS: Huawei Cloud EulerOS 2.0 (aarch64)
Docker container OS: Anolis OS 8.6
Some Python dependencies:
grpc-interceptor==0.15.4
grpcio==1.66.1
grpcio-tools==1.51.1
grpclib==0.4.3
Reproduction script
run normal ray task in Huawei Cloud EulerOS 2.0 (aarch64)
Issue Severity
High: It blocks me from completing my task.
Metadata
Metadata
Assignees
Labels
bugSomething that is supposed to be working; but isn'tSomething that is supposed to be working; but isn'tcommunity-backlogcoreIssues that should be addressed in Ray CoreIssues that should be addressed in Ray CorestabilitytriageNeeds triage (eg: priority, bug/not-bug, and owning component)Needs triage (eg: priority, bug/not-bug, and owning component)