e2e CI Job #259

Open · danehans opened this issue Jan 30, 2025 · 8 comments
Labels: help wanted (Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.)

danehans (Contributor) commented Jan 30, 2025:

Create a presubmit CI job that runs the e2e tests.

  • Does the project create a fake model server, or use a model that does not require a signed license agreement (xref)? @liu-cong suggested using Qwen here, but I have yet to test vLLM with LoRA support.
  • How does the project acquire at least 3 GPU resources?
  • Will the job run on a BM/VM cluster, or on a single VM with 3 GPUs using kind? (A kind-based sketch follows this list.)
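If the kind route is chosen, a minimal presubmit script might look like the sketch below. This is an illustration only: the cluster name, make targets, and image tag are assumptions, not the project's actual layout.

#!/usr/bin/env bash
# Hypothetical presubmit e2e flow: create a throwaway kind cluster,
# load the image under test, run the e2e suite, and always tear down.
set -euo pipefail

CLUSTER_NAME="e2e-$RANDOM"
kind create cluster --name "$CLUSTER_NAME"
trap 'kind delete cluster --name "$CLUSTER_NAME"' EXIT

# Assumed entry points -- substitute the project's real build/test targets.
make image-build
kind load docker-image example.com/epp:dev --name "$CLUSTER_NAME"  # hypothetical image tag
make test-e2e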
danehans added the help wanted label on Jan 30, 2025
danehans (Contributor, Author) commented:

@ahg-g @kfswain IMO this should be required for v0.1.

ahg-g (Contributor) commented Jan 30, 2025:

I don't think it is required as long as we can run the test manually after each commit.

btw, this requires #244

Jeffwan (Contributor) commented Jan 30, 2025:

I can help work on the vLLM CPU version. I will give an update soon.

joerunde commented:

I see that there was intent for vLLM to support a vllm-cpu release on Docker Hub, but I can't seem to find it:
vllm-project/vllm#11261

danehans (Contributor, Author) commented:

> I don't think it is required as long as we can run the test manually after each commit.

Okay, but manual testing post-commit is a bit risky. Adding the CI job should be a high priority after v0.1.

Jeffwan (Contributor) commented Feb 5, 2025:

I built one, published it to my personal Docker Hub, and ran it successfully.

The only thing worth noting is that the image size is still ~9 GB, which could cause "no space left on device" failures on the CI compute node (a cleanup sketch follows the image listing below).

root@129-213-130-206:/home/ubuntu# docker images
REPOSITORY     TAG       IMAGE ID       CREATED          SIZE
vllm-cpu-env   latest    8ddb28e44d97   10 minutes ago   9.19GB
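
If the image size does bite on CI, one common mitigation is to reclaim disk space on the runner before pulling. A rough sketch, assuming a GitHub-hosted Ubuntu runner; whether the preinstalled toolchains are safe to delete depends on the job:

# Reclaim space from unused Docker data on the node.
docker system prune -af --volumes

# On GitHub-hosted Ubuntu runners, large preinstalled toolchains can be removed.
sudo rm -rf /usr/share/dotnet /opt/ghc

# Sanity-check free space before pulling the ~9 GB image.
df -h /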

Scripts I used:

# Run the vLLM CPU image with host networking, pinned to CPUs 0-15,
# mounting the local Hugging Face cache and registering a LoRA adapter
# from a locally cached snapshot.
docker run -it --rm --network=host \
  --cpuset-cpus="0-15" --cpuset-mems="0" \
  -v /home/ubuntu/.cache/huggingface:/root/.cache/huggingface \
  seedjeffwan/vllm-cpu-env:bb392af4-20250203 --model Qwen/Qwen2.5-1.5B-Instruct \
  --enable-lora \
  --lora-modules=lora1=/root/.cache/huggingface/hub/models--ai-blond--Qwen-Qwen2.5-Coder-1.5B-Instruct-lora/snapshots/9cde18d8ed964b0519fb481cca6acd936b2ca811
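
Once that container is up (host networking, so vLLM's default port 8000 is reachable on localhost), a quick smoke test against the OpenAI-compatible API could look like this; the prompt and max_tokens are arbitrary, and "lora1" is the adapter name registered above:

# Query the base model.
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen/Qwen2.5-1.5B-Instruct", "prompt": "Hello,", "max_tokens": 16}'

# Query through the LoRA adapter by its registered name.
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "lora1", "prompt": "Hello,", "max_tokens": 16}'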

The following snippet should work as well, pulling the LoRA adapter directly from Hugging Face instead of a local snapshot:

docker run -it --rm \
  seedjeffwan/vllm-cpu-env:bb392af4-20250203 --model Qwen/Qwen2.5-1.5B-Instruct \
  --enable-lora \
  --lora-modules=lora1=ai-blond/Qwen-Qwen2.5-Coder-1.5B-Instruct


danehans (Contributor, Author) commented Feb 6, 2025:

@Jeffwan I can reproduce #259 (comment), but I can't run the Qwen model. See this gist for details.

Jeffwan (Contributor) commented Feb 7, 2025:

@danehans Let me check the gist and see if there's anything I can help with.
