
Commit d455817

Add musa_simple Dockerfile for supporting Moore Threads GPU

Signed-off-by: Xiaodong Ye <[email protected]>

1 parent 22d51d4

2 files changed: 46 additions, 4 deletions

docker/README.md

Lines changed: 19 additions & 4 deletions

@@ -1,5 +1,5 @@
 ### Install Docker Server
-> [!IMPORTANT]
+> [!IMPORTANT]
 > This was tested with Docker running on Linux. <br>If you can get it working on Windows or MacOS, please update this `README.md` with a PR!<br>
 
 [Install Docker Engine](https://docs.docker.com/engine/install)
@@ -16,7 +16,7 @@ docker run --cap-add SYS_RESOURCE -e USE_MLOCK=0 -e MODEL=/var/model/<model-path
 where `<model-root-path>/<model-path>` is the full path to the model file on the Docker host system.
 
 ### cuda_simple
-> [!WARNING]
+> [!WARNING]
 > Nvidia GPU CuBLAS support requires an Nvidia GPU with sufficient VRAM (approximately as much as the size in the table below) and Docker Nvidia support (see [container-toolkit/install-guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html)) <br>
 
 A simple Dockerfile for CUDA-accelerated CuBLAS, where the model is located outside the Docker image:
@@ -30,6 +30,21 @@ where `<model-root-path>/<model-path>` is the full path to the model file on the
 
 --------------------------------------------------------------------------
 
+### musa_simple
+> [!WARNING]
+> Moore Threads GPU MuBLAS support requires an MTT GPU with sufficient VRAM (approximately as much as the size in the table below) and MT CloudNative Toolkits support (see [download](https://developer.mthreads.com/sdk/download/CloudNative)) <br>
+
+A simple Dockerfile for MUSA-accelerated MuBLAS, where the model is located outside the Docker image:
+
+```
+cd ./musa_simple
+docker build -t musa_simple .
+docker run --cap-add SYS_RESOURCE -e USE_MLOCK=0 -e MODEL=/var/model/<model-path> -v <model-root-path>:/var/model -t musa_simple
+```
+
+where `<model-root-path>/<model-path>` is the full path to the model file on the Docker host system.
+
+--------------------------------------------------------------------------
+
 ### "Open-Llama-in-a-box"
 Download an Apache V2.0 licensed 3B params Open LLaMA model and install into a Docker image that runs an OpenBLAS-enabled llama-cpp-python server:
 ```
@@ -47,7 +62,7 @@ docker $ ls -lh *.bin
 lrwxrwxrwx 1 user user 24 May 23 18:30 model.bin -> <downloaded-model-file>q5_1.bin
 ```
 
-> [!NOTE]
+> [!NOTE]
 > Make sure you have enough disk space to download the model. As the model is then copied into the image you will need at least
 **TWICE** as much disk space as the size of the model:<br>
 
@@ -60,5 +75,5 @@ lrwxrwxrwx 1 user user 24 May 23 18:30 model.bin -> <downloaded-model-file>q5_
 | 65B | 50 GB |
 
 
-> [!NOTE]
+> [!NOTE]
 > If you want to pass or tune additional parameters, customise `./start_server.sh` before running `docker build ...`
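For reference, an end-to-end run of the new `musa_simple` image might look like the sketch below; the host directory `/home/user/models` and the model file `llama-2-7b.Q4_K_M.gguf` are hypothetical stand-ins for `<model-root-path>` and `<model-path>`, not part of this commit:

```
# Build the image from the new directory added by this commit
cd ./docker/musa_simple
docker build -t musa_simple .

# Run the server; the mounted path and model filename below are
# hypothetical examples of <model-root-path> and <model-path>
docker run --cap-add SYS_RESOURCE \
  -e USE_MLOCK=0 \
  -e MODEL=/var/model/llama-2-7b.Q4_K_M.gguf \
  -v /home/user/models:/var/model \
  -t musa_simple
```

This mirrors the cuda_simple workflow above; only the image name and the MUSA base image differ.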

docker/musa_simple/Dockerfile

Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@
+ARG MUSA_IMAGE="rc3.1.0-devel-ubuntu22.04"
+FROM mthreads/musa:${MUSA_IMAGE}
+
+# We need to set the host to 0.0.0.0 to allow outside access
+ENV HOST 0.0.0.0
+
+RUN apt-get update && apt-get upgrade -y \
+    && apt-get install -y git build-essential \
+    python3 python3-pip gcc wget \
+    ocl-icd-opencl-dev opencl-headers clinfo \
+    libclblast-dev libopenblas-dev \
+    && mkdir -p /etc/OpenCL/vendors && cp /driver/etc/OpenCL/vendors/MT.icd /etc/OpenCL/vendors/MT.icd
+
+COPY . .
+
+# setting build related env vars
+ENV MUSA_DOCKER_ARCH=all
+ENV GGML_MUSA=1
+
+# Install dependencies
+RUN python3 -m pip install --upgrade pip pytest cmake scikit-build setuptools fastapi uvicorn sse-starlette pydantic-settings starlette-context
+
+# Install llama-cpp-python (build with musa)
+RUN CMAKE_ARGS="-DGGML_MUSA=on" pip install llama-cpp-python
+
+# Run the server
+CMD python3 -m llama_cpp.server
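Because `MUSA_IMAGE` is declared as a build argument before `FROM`, the base image tag can be overridden at `docker build` time without editing the Dockerfile. A minimal sketch, where the alternative tag is a hypothetical example rather than a tag confirmed by this commit:

```
# Override the default base image (rc3.1.0-devel-ubuntu22.04);
# "rc3.2.0-devel-ubuntu22.04" is a hypothetical alternative tag
docker build --build-arg MUSA_IMAGE=rc3.2.0-devel-ubuntu22.04 -t musa_simple .
```

Since `ENV HOST 0.0.0.0` makes the server listen on all interfaces inside the container, publishing the port (e.g. adding `-p 8000:8000` to the `docker run` command, assuming the server's default port of 8000) is enough to reach it from the host.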
