Skip to content

ray crashes frequently on EulerOS #59966

@mjsky481

Description

@mjsky481

What happened + What you expected to happen

When I use ray head to send tasks to worker, the task should be finished properly. However, during the task processing the ray cluster crashes very frequently and transiently.

gcs_server.out.log

Versions / Dependencies

Ray: 2.9.3
Python: 3.10.16
Host machine OS: Huawei Cloud EulerOS 2.0 (aarch64)
Docker container OS: Anolis OS 8.6
Some Python dependencies:
grpc-interceptor==0.15.4
grpcio==1.66.1
grpcio-tools==1.51.1
grpclib==0.4.3

Reproduction script

run normal ray task in Huawei Cloud EulerOS 2.0 (aarch64)

Issue Severity

High: It blocks me from completing my task.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething that is supposed to be working; but isn'tcommunity-backlogcoreIssues that should be addressed in Ray CorestabilitytriageNeeds triage (eg: priority, bug/not-bug, and owning component)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions