[Bug]: ModuleNotFoundError: No module named 'ray' #854

Open
gizbo opened this issue Dec 2, 2024 · 4 comments
Labels
bug Something isn't working

Comments

gizbo commented Dec 2, 2024

Your current environment

N/A

🐛 Describe the bug

Hello,
Running the provided quickstart Docker run command, and getting the following error:

INFO: Multiprocessing frontend to use ipc:///tmp/3f2ae52b-cfde-4764-ad60-361c1c2ced18 for RPC Path.
INFO: Started engine process with PID 57
Process SpawnProcess-1:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/aphrodite/executor/ray_utils.py", line 13, in
import ray
ModuleNotFoundError: No module named 'ray'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.10/dist-packages/aphrodite/endpoints/openai/rpc/server.py", line 214, in run_rpc_server
server = AsyncEngineRPCServer(async_engine_args, rpc_path)
File "/usr/local/lib/python3.10/dist-packages/aphrodite/endpoints/openai/rpc/server.py", line 29, in init
self.engine = AsyncAphrodite.from_engine_args(async_engine_args)
File "/usr/local/lib/python3.10/dist-packages/aphrodite/engine/async_aphrodite.py", line 703, in from_engine_args
engine_config = engine_args.create_engine_config()
File "/usr/local/lib/python3.10/dist-packages/aphrodite/engine/args_tools.py", line 936, in create_engine_config
parallel_config = ParallelConfig(
File "/usr/local/lib/python3.10/dist-packages/aphrodite/common/config.py", line 963, in init
raise ValueError("Unable to load Ray which is "
ValueError: Unable to load Ray which is required for multi-node inference, please install Ray with `pip install ray`.
No module named 'ray'
^CTraceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.10/dist-packages/aphrodite/endpoints/openai/api_server.py", line 802, in
asyncio.run(run_server(args))
File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
self.run_forever()
File "/usr/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
self._run_once()
File "/usr/lib/python3.10/asyncio/base_events.py", line 1871, in _run_once
event_list = self._selector.select(timeout)
File "/usr/lib/python3.10/selectors.py", line 469, in select
fd_event_list = self._selector.poll(timeout, max_ev)
Thanks
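
One immediate workaround while this is open, since the error itself suggests `pip install ray`: extend the image with Ray preinstalled. A minimal, untested sketch (the derived tag aphrodite-openai-ray is made up):

    # Build a derived image with Ray added on top of the published one
    docker build -t aphrodite-openai-ray - <<'EOF'
    FROM alpindale/aphrodite-openai:latest
    RUN pip install ray
    EOF

The quickstart command can then point at aphrodite-openai-ray instead of alpindale/aphrodite-openai:latest.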

gizbo added the bug label Dec 2, 2024
@AlpinDale
Member

Can you share your Docker command? Ray should not be used unless you launch the engine with --worker-use-ray or --distributed-executor-backend=ray.
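
For what it's worth, whether the image ships Ray at all is easy to check directly; a quick probe, assuming python3 is on the container's PATH:

    # Run the interpreter instead of the API server and try the import
    docker run --rm --entrypoint python3 alpindale/aphrodite-openai:latest -c "import ray"

If that raises the same ModuleNotFoundError, the package is simply absent from the image.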

@gizbo
Author

gizbo commented Dec 2, 2024

Hey, thanks for the quick reply. I was trying the command from the README.md:
Docker
Additionally, we provide a Docker image for easy deployment. Here's a basic command to get you started:

docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    #--env "CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7" \
    -p 2242:2242 \
    --ipc=host \
    alpindale/aphrodite-openai:latest \
    --model NousResearch/Meta-Llama-3.1-8B-Instruct \
    --tensor-parallel-size 8 \
    --api-keys "sk-empty"

@AlpinDale
Member

Can you add --distributed-executor-backend=mp to the launch flags?
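
Applied to the README command above, that would look roughly like this (untested; every other flag kept as quoted):

    docker run --runtime nvidia --gpus all \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        -p 2242:2242 \
        --ipc=host \
        alpindale/aphrodite-openai:latest \
        --model NousResearch/Meta-Llama-3.1-8B-Instruct \
        --tensor-parallel-size 8 \
        --api-keys "sk-empty" \
        --distributed-executor-backend=mp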

@baditaflorin

By default, I get the same error:

docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7" \
    -p 2242:2242 \
    --ipc=host \
    alpindale/aphrodite-openai:latest \
    --model NousResearch/Meta-Llama-3.1-8B-Instruct \
    --tensor-parallel-size 8 \
    --api-keys "sk-empty"
Unable to find image 'alpindale/aphrodite-openai:latest' locally
latest: Pulling from alpindale/aphrodite-openai
3c645031de29: Pull complete
0d6448aff889: Pull complete
0a7674e3e8fe: Pull complete
b71b637b97c5: Pull complete
56dc85502937: Pull complete
c1c890480c74: Pull complete
93929e83ed21: Pull complete
0ead3d2f76c1: Pull complete
60cdee2e316d: Pull complete
518f3d7cac80: Pull complete
336c5995c4b2: Pull complete
Digest: sha256:8bac4170be255c19d29d84ffbdeabdc1b0a09ee511bec7ed0026e349db430357
Status: Downloaded newer image for alpindale/aphrodite-openai:latest
INFO:     Multiprocessing frontend to use ipc:///tmp/535bb624-82bb-42e8-bbc7-5ea63814857e for RPC Path.
INFO:     Started engine process with PID 46
Process SpawnProcess-1:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/aphrodite/executor/ray_utils.py", line 13, in <module>
    import ray
ModuleNotFoundError: No module named 'ray'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/aphrodite/endpoints/openai/rpc/server.py", line 214, in run_rpc_server
    server = AsyncEngineRPCServer(async_engine_args, rpc_path)
  File "/usr/local/lib/python3.10/dist-packages/aphrodite/endpoints/openai/rpc/server.py", line 29, in __init__
    self.engine = AsyncAphrodite.from_engine_args(async_engine_args)
  File "/usr/local/lib/python3.10/dist-packages/aphrodite/engine/async_aphrodite.py", line 703, in from_engine_args
    engine_config = engine_args.create_engine_config()
  File "/usr/local/lib/python3.10/dist-packages/aphrodite/engine/args_tools.py", line 936, in create_engine_config
    parallel_config = ParallelConfig(
  File "/usr/local/lib/python3.10/dist-packages/aphrodite/common/config.py", line 963, in __init__
    raise ValueError("Unable to load Ray which is "
ValueError: Unable to load Ray which is required for multi-node inference, please install Ray with `pip install ray`.

With --distributed-executor-backend=mp it seems to work, after I set CUDA_VISIBLE_DEVICES=0 and --tensor-parallel-size 1:

docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "CUDA_VISIBLE_DEVICES=0" \
    -p 2242:2242 \
    --ipc=host \
    alpindale/aphrodite-openai:latest \
    --model NousResearch/Meta-Llama-3.1-8B-Instruct \
    --tensor-parallel-size 1 \
    --api-keys "sk-empty" \
    --distributed-executor-backend=mp
INFO:     Multiprocessing frontend to use ipc:///tmp/6613166f-863d-42db-98cc-5c78ae5f00a4 for RPC Path.
INFO:     Started engine process with PID 44
WARNING:  The model has a long context length (131072). This may cause OOM
errors during the initial memory profiling phase, or result in low performance
due to small KV cache space. Consider setting --max-model-len to a smaller
value.
INFO:     -------------------------------------------------------------------------------------
INFO:     Initializing Aphrodite Engine (v0.6.4.post1 commit 20f11fd0) with the
following config:
INFO:     Model = 'NousResearch/Meta-Llama-3.1-8B-Instruct'
INFO:     DataType = torch.bfloat16
INFO:     Tensor Parallel Size = 1
INFO:     Pipeline Parallel Size = 1
INFO:     Disable Custom All-Reduce = False
INFO:     Context Length = 131072
INFO:     Enforce Eager Mode = False
INFO:     Prefix Caching = False
INFO:     Device = device(type='cuda')
INFO:     Guided Decoding Backend = DecodingConfig(guided_decoding_backend='lm-format-enforcer')
INFO:     -------------------------------------------------------------------------------------
WARNING:  Reducing Torch parallelism from 12 threads to 1 to avoid unnecessary
CPU contention. Set OMP_NUM_THREADS in the external environment to tune this
value as needed.
INFO:     Loading model NousResearch/Meta-Llama-3.1-8B-Instruct...
INFO:     Using model weights format ['*.safetensors']
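
Presumably the mp backend also covers single-node multi-GPU tensor parallelism without Ray; an untested variant of the command above for two GPUs, keeping --tensor-parallel-size equal to the number of visible devices:

    docker run --runtime nvidia --gpus all \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "CUDA_VISIBLE_DEVICES=0,1" \
        -p 2242:2242 \
        --ipc=host \
        alpindale/aphrodite-openai:latest \
        --model NousResearch/Meta-Llama-3.1-8B-Instruct \
        --tensor-parallel-size 2 \
        --api-keys "sk-empty" \
        --distributed-executor-backend=mp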
