e2e CI Job #259

Open · danehans opened this issue Jan 30, 2025 · 8 comments
Labels: help wanted (Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.)

danehans (Contributor) commented Jan 30, 2025:

Create a presubmit CI job that runs the e2e tests.

  • Does the project create a fake model server, or use a model that does not require a signed license agreement (xref)? @liu-cong suggested using Qwen here, but I have yet to test vLLM with LoRA support.
  • How does the project acquire at least 3 GPU resources?
  • Will the job run on a BM/VM cluster, or on a single VM with 3 GPUs using kind? (A kind-based sketch follows this list.)
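If the kind route is chosen, a minimal presubmit script might look like the sketch below. This is an illustration only: the cluster name, make targets, and image tag are assumptions, not the project's actual layout.

#!/usr/bin/env bash
# Hypothetical presubmit e2e flow: create a throwaway kind cluster,
# load the image under test, run the e2e suite, and always tear down.
set -euo pipefail

CLUSTER_NAME="e2e-$RANDOM"
kind create cluster --name "$CLUSTER_NAME"
trap 'kind delete cluster --name "$CLUSTER_NAME"' EXIT

# Assumed entry points -- substitute the project's real build/test targets.
make image-build
kind load docker-image example.com/epp:dev --name "$CLUSTER_NAME"  # hypothetical image tag
make test-e2e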
danehans added the help wanted label on Jan 30, 2025
danehans (Contributor, Author) commented:

@ahg-g @kfswain IMO this should be required for v0.1.

ahg-g (Contributor) commented Jan 30, 2025:

I don't think it is required as long as we can run the test manually after each commit.

btw, this requires #244

Jeffwan (Contributor) commented Jan 30, 2025:

I can help work on the vLLM CPU version. I will give an update soon.

joerunde commented:

I see that there was intent for vLLM to support a vllm-cpu release on Docker Hub, but I can't seem to find it:
vllm-project/vllm#11261

danehans (Contributor, Author) commented:

> I don't think it is required as long as we can run the test manually after each commit.

Okay, but manual testing post-commit is a bit risky. Adding the CI job should be a high priority after v0.1.

Jeffwan (Contributor) commented Feb 5, 2025:

I built one, published it to my personal Docker Hub, and ran it successfully.

The only thing worth noting is that the image size is still ~9 GB, which could cause "no space left on device" failures on the CI compute node (a cleanup sketch follows the image listing below).

root@129-213-130-206:/home/ubuntu# docker images
REPOSITORY     TAG       IMAGE ID       CREATED          SIZE
vllm-cpu-env   latest    8ddb28e44d97   10 minutes ago   9.19GB
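
If the image size does bite on CI, one common mitigation is to reclaim disk space on the runner before pulling. A rough sketch, assuming a GitHub-hosted Ubuntu runner; whether the preinstalled toolchains are safe to delete depends on the job:

# Reclaim space from unused Docker data on the node.
docker system prune -af --volumes

# On GitHub-hosted Ubuntu runners, large preinstalled toolchains can be removed.
sudo rm -rf /usr/share/dotnet /opt/ghc

# Sanity-check free space before pulling the ~9 GB image.
df -h /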

Scripts I used:

# Run the vLLM CPU image with host networking, pinned to CPUs 0-15,
# mounting the local Hugging Face cache and registering a LoRA adapter
# from a locally cached snapshot.
docker run -it --rm --network=host \
  --cpuset-cpus="0-15" --cpuset-mems="0" \
  -v /home/ubuntu/.cache/huggingface:/root/.cache/huggingface \
  seedjeffwan/vllm-cpu-env:bb392af4-20250203 --model Qwen/Qwen2.5-1.5B-Instruct \
  --enable-lora \
  --lora-modules=lora1=/root/.cache/huggingface/hub/models--ai-blond--Qwen-Qwen2.5-Coder-1.5B-Instruct-lora/snapshots/9cde18d8ed964b0519fb481cca6acd936b2ca811
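
Once that container is up (host networking, so vLLM's default port 8000 is reachable on localhost), a quick smoke test against the OpenAI-compatible API could look like this; the prompt and max_tokens are arbitrary, and "lora1" is the adapter name registered above:

# Query the base model.
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen/Qwen2.5-1.5B-Instruct", "prompt": "Hello,", "max_tokens": 16}'

# Query through the LoRA adapter by its registered name.
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "lora1", "prompt": "Hello,", "max_tokens": 16}'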

The following snippet should work as well, pulling the LoRA adapter directly from Hugging Face instead of a local snapshot:

docker run -it --rm \
  seedjeffwan/vllm-cpu-env:bb392af4-20250203 --model Qwen/Qwen2.5-1.5B-Instruct \
  --enable-lora \
  --lora-modules=lora1=ai-blond/Qwen-Qwen2.5-Coder-1.5B-Instruct


danehans (Contributor, Author) commented Feb 6, 2025:

@Jeffwan I can reproduce #259 (comment), but I can't run the Qwen model. See this gist for details.

Jeffwan (Contributor) commented Feb 7, 2025:

@danehans Let me check the gist and see if there's anything I can help with.
