
examples: Add test run for lv2v #283

Open
wants to merge 30 commits into base: main

Conversation

@victorges (Member) commented Nov 20, 2024

No description provided.

Wasn't able to run it due to the Go environment, but it should be close to working.
@victorges changed the title from "examples: Add possible example for lv2v" to "examples: Add test run for lv2v" on Nov 20, 2024
@varshith15 marked this pull request as ready for review December 5, 2024 20:39
@varshith15 requested a review from rickstaa as a code owner December 5, 2024 20:39
@victorges (Member Author) left a comment

Sending some questions from as far as I got; I'll finish this tomorrow!

Comment on lines 40 to 52
      - name: Build Docker images
        env:
          MODEL_ID: ${{ matrix.model_config.id }}
        run: |
          cd runner
          docker build -t livepeer/ai-runner:live-base -f docker/Dockerfile.live-base .
          if [ "${MODEL_ID}" = "noop" ]; then
            docker build -t livepeer/ai-runner:live-app-noop -f docker/Dockerfile.live-app-noop .
          else
            docker build -t livepeer/ai-runner:live-base-${MODEL_ID} -f docker/Dockerfile.live-base-${MODEL_ID} .
            docker build -t livepeer/ai-runner:live-app-${MODEL_ID} -f docker/Dockerfile.live-app__PIPELINE__ --build-arg PIPELINE=${MODEL_ID} .
          fi
          cd ..
victorges (Member Author):

Hmm, I think this will make the test take way longer, like 30 minutes.

Some ideas:

  • Somehow run this test after the separate Docker build workflow, which also has some optimizations to only build what's necessary (e.g. it skips the base images if nothing changed). Ideally we wouldn't have it in the same workflow file, but I'm not sure if it's possible to make cross-workflow dependencies (see the workflow_run sketch below).
  • Simplify the build here by always skipping the base, meaning we would only copy the app code into the specific pipeline base image that has already been published to Docker Hub (Docker would automatically pull the base). This is less accurate in the sense that, when we do change the base images, the test won't pick up the change, but at least it won't take 30 minutes per run.
  • Maybe a little more sophisticated: start by pulling the app image from Docker Hub first, so these builds would be optimized by the Docker layer cache. Not sure if this would work though; I've seen the cache not kick in even after pulling the base :(
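
For reference, a rough sketch of what a cross-workflow dependency could look like with the GitHub Actions workflow_run trigger; the workflow name "Docker build", the runner labels, and the image tag are assumptions, not values from this repo (note that workflow_run fires in the context of the default branch, which may complicate per-PR testing):

on:
  workflow_run:
    workflows: ["Docker build"]   # assumed name of the separate build workflow
    types: [completed]

jobs:
  live-e2e:
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: [self-hosted, linux]
    steps:
      - name: Pull the image published by the build workflow
        run: docker pull livepeer/ai-runner:live-app-noop   # placeholder tag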

varshith15 (Collaborator):

  • If we use the same node each time, which is the case, the layers are cached anyway, right? But yeah, if a Docker build workflow already exists, we can just try to use that.
  • I've added the base build because we have to account for changes in required packages -- maybe we can run the base Docker build only when the Dockerfiles or requirements files are updated (rough sketch below).
  • I think it's important that there is caching; otherwise any optimizations won't cut it.
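
A rough sketch of "only rebuild the base when its inputs change" using the dorny/paths-filter action; the path globs are assumptions about where the Dockerfiles and requirements files live:

      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v2
        id: changes
        with:
          filters: |
            base:
              - 'runner/docker/Dockerfile.live-base*'   # assumed location
              - 'runner/requirements*.txt'              # assumed location
      - name: Build base image only when its inputs changed
        if: steps.changes.outputs.base == 'true'
        run: |
          cd runner
          docker build -t livepeer/ai-runner:live-base -f docker/Dockerfile.live-base .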

victorges (Member Author):

> maybe can only run base docker build only when dockerfiles or req files are updated

Yeah, we do that in the Docker build workflow. But it's kind of a pain TBH, pretty complex, and it hurts even more to repeat all of it here 💀
Would be great if we could reuse it somehow.

hjpotter92 (Member):

With the self-hosted runners, we clean up disk frequently so that we don't run out of disk space for other jobs.

This means the intermediate layers etc. are also gone. As Victor mentioned above (and I mentioned in the Discord thread), the best option would be to trigger this workflow after the Docker build has finished and pull those images for testing (sketch below).
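
A minimal sketch of what the test job's image step could look like once it pulls the published images instead of building them; the tags follow the naming used in the build step above, and whether the base images also need pulling here is left out:

      - name: Pull prebuilt images from Docker Hub
        env:
          MODEL_ID: ${{ matrix.model_config.id }}
        run: |
          if [ "${MODEL_ID}" = "noop" ]; then
            docker pull livepeer/ai-runner:live-app-noop
          else
            docker pull livepeer/ai-runner:live-app-${MODEL_ID}
          fi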

@varshith15 (Collaborator) commented Dec 11, 2024

@hjpotter92 can we avoid wiping the models dir, if possible? That would save a lot of time; we can just link the persistent path to the required path (rough sketch below).
@victorges thoughts?
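
A hypothetical sketch of that idea as a workflow step: keep the downloaded weights on a host path that survives the cleanup and symlink it to wherever the job expects its models directory. Both paths are placeholders, not the repo's actual layout:

      - name: Reuse a persistent model cache across runs
        run: |
          sudo mkdir -p /mnt/persistent/models                                # placeholder host path that survives cleanup
          ln -sfn /mnt/persistent/models "${GITHUB_WORKSPACE}/runner/models"   # placeholder target path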

    strategy:
      max-parallel: 1
      matrix:
        model_config:
victorges (Member Author):

We should use "pipeline" instead of "model" everywhere in this file. It is only called model in the legacy code because we reused the existing model_id param to select which live pipeline to run, but ideally we keep that misnaming in as few places as possible (and always with a comment disclaimer). Rough sketch of the rename below.
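
A tiny sketch of the rename, with the disclaimer comment kept next to the one place the legacy name still leaks through; the matrix key and entry are illustrative:

    strategy:
      max-parallel: 1
      matrix:
        pipeline_config:
          # NOTE: the runner API still receives this value as model_id; it is
          # really the name of the live pipeline to run.
          - id: noop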


Comment on lines 59 to 61
      - name: Clean up runner dockers
        run: |
          docker ps -aq --filter name=live-video-to-video_${MODEL_ID}* | xargs -r docker rm -f || true
victorges (Member Author):

Do these leak? 😳

varshith15 (Collaborator):

We can't deploy different containers with the same open ports; live-video-to-video uses 8900.

victorges (Member Author):

Yeah, but why would there be a running container when the workflow starts?

varshith15 (Collaborator):

The same workflow runs for all the models (noop, liveportrait, etc.) one after the other, so after each model run we have to delete its container.

victorges (Member Author):

Ah ok, got it.

@victorges (Member Author) left a comment

LGTM!

elapsed := time.Since(startTime)
sleepTime := interval - elapsed
if sleepTime > 0 {
	time.Sleep(sleepTime)
victorges (Member Author):

No need to refactor, but just a tip for the future: the idiomatic way to do this in Go would be with a time.Ticker. You'd just add another select case for it here, and you'd get consistent iterations every interval (rough sketch below).
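
A minimal sketch of the time.Ticker pattern being described; ctx and interval stand in for whatever context and duration the surrounding loop already uses:

ticker := time.NewTicker(interval) // interval is the same time.Duration used for the manual sleep
defer ticker.Stop()
for {
	select {
	case <-ctx.Done():
		return // stop when the surrounding context is cancelled
	case <-ticker.C:
		// do the periodic work here; ticks arrive every interval with no
		// manual elapsed/sleep bookkeeping
	}
}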

victorges (Member Author):

Great stuff!

image.png Outdated
victorges (Member Author):

Is this used? I thought we only had the example_data now.

varshith15 (Collaborator):

Yeah, my bad, that was part of testing.

Comment on lines +301 to +303
optimizationFlags := worker.OptimizationFlags{
	"STREAM_PROTOCOL": "zeromq",
}
victorges (Member Author):

I was so confused, I always thought these "optimization flags" were something for Docker, not the app lol

@@ -18,7 +18,7 @@ RUN pyenv install $PYTHON_VERSION && \
ARG PIP_VERSION=23.3.2

# Install any additional Python packages
-RUN pip install uvicorn==0.30.0 fastapi==0.111.0 numpy==1.26.4 torch==2.5.1 diffusers==0.30.0 transformers==4.43.3 aiohttp==3.10.9 pyzmq==26.2.0
+RUN pip install uvicorn==0.30.0 fastapi==0.111.0 numpy==1.26.4 torch==2.5.1 diffusers==0.30.0 transformers==4.43.3 aiohttp==3.10.9 pyzmq==26.2.0 pynvml==12.0.0
victorges (Member Author):

I already fixed this on main, you can revert this here!
