[LocalNet] Add infrastructure to run LLM inference #508

Merged: 9 commits, May 3, 2024

okdas (Member) commented Apr 27, 2024

Summary

Adds infrastructure to run and develop against an LLM on LocalNet.

Issue

Type of change

Select one or more:

  • New feature, functionality or library
  • Bug fix
  • Code health or cleanup
  • Documentation
  • Other (specify)

Testing

Documentation changes (only if making doc changes)

  • make docusaurus_start; only needed if you make doc changes

Local Testing (only if making code changes)

  • Unit Tests: make go_develop_and_test
  • LocalNet E2E Tests: make test_e2e
  • See quickstart guide for instructions

PR Testing (only if making code changes)

  • DevNet E2E Tests: Add the devnet-test-e2e label to the PR.
    • THIS IS VERY EXPENSIVE, so only do it after all the reviews are complete.
    • Optionally run make trigger_ci if you want to re-trigger tests without any code changes
    • If tests fail, try re-running failed tests only using the GitHub UI as shown here

Sanity Checklist

  • I have tested my changes using the available tooling
  • I have commented my code
  • I have performed a self-review of my own code; both comments & source code
  • I create and reference any new tickets, if applicable
  • I have left TODOs throughout the codebase, if applicable

okdas changed the title from "[LocalNet] Add infrastructure to run llm inference" to "[LocalNet] Add infrastructure to run LLM inference" Apr 27, 2024

okdas (Member, Author) commented Apr 27, 2024

Note: this functionality is gated and turned off by default, so that LocalNet does not download and serve an LLM unless needed, which preserves resources. Turn on ollama in localnet_config.yaml when needed.

The infrastructure works on its own. You can run a request with curl:

kubectl exec "$(tilt get kd validator -ojsonpath='{.status.pods[0].name}')" -- \
curl -X POST http://ollama:11434/v1/chat/completions -H "Content-Type: application/json" \
    -d '{
        "model": "qwen:0.5b",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
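
For anyone who would rather issue the same request from Go than from curl, here is a minimal sketch. It only assumes the endpoint and body shown in the curl command above; the helper name newChatCompletionRequest and the struct names are hypothetical and do not come from this PR. The request is built but not sent, so the snippet runs without a cluster.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatMessage and chatRequest mirror the JSON body of the curl command above.
// These type names are illustrative, not part of the project.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatCompletionRequest (hypothetical helper) builds the POST request
// for the /v1/chat/completions endpoint without sending it.
func newChatCompletionRequest(baseURL, model string, msgs []chatMessage) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{Model: model, Messages: msgs})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newChatCompletionRequest("http://ollama:11434", "qwen:0.5b", []chatMessage{
		{Role: "system", Content: "You are a helpful assistant."},
		{Role: "user", Content: "Hello!"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it from somewhere that can resolve the ollama service:
	//   resp, err := http.DefaultClient.Do(req)
}
```

As with the curl command, the hostname ollama only resolves from inside the cluster, so actually sending the request would need to happen from a pod such as the validator.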

However, it looks like we don't support anything other than JSON-RPC at the moment:

// TODO_BLOCKER(@h5law): This function currently only supports JSON-RPC and must

I get the following error:

{"level":"error","error":"got: {\n        \"model\": \"qwen:0.5b\",\n        \"messages\": [\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a helpful assistant.\"\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"Hello!\"\n            }\n        ]\n    }: unrecognised request format in partial payload","service_id":"ollama","message":"failed getting error reply"}
{"level":"error","error":"got: {\n        \"model\": \"qwen:0.5b\",\n        \"messages\": [\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a helpful assistant.\"\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"Hello!\"\n            }\n        ]\n    }: unrecognised request format in partial payload","message":"failed getting request type"}
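
To illustrate why a chat-completions body is rejected, here is a rough sketch of a JSON-RPC-only check. This is not the project's actual detection code: looksLikeJSONRPC is a hypothetical function that simply looks for the "jsonrpc" and "method" members that a JSON-RPC 2.0 request carries. The OpenAI-style payload has neither, so a parser that only knows JSON-RPC has nothing to match on.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// looksLikeJSONRPC is a hypothetical check, not the project's implementation.
// It reports whether the payload has the members of a JSON-RPC 2.0 request.
func looksLikeJSONRPC(payload []byte) bool {
	var probe struct {
		JSONRPC string `json:"jsonrpc"`
		Method  string `json:"method"`
	}
	if err := json.Unmarshal(payload, &probe); err != nil {
		return false
	}
	return probe.JSONRPC == "2.0" && probe.Method != ""
}

func main() {
	jsonRPC := []byte(`{"jsonrpc":"2.0","id":1,"method":"eth_blockNumber","params":[]}`)
	chat := []byte(`{"model":"qwen:0.5b","messages":[{"role":"user","content":"Hello!"}]}`)

	fmt.Println(looksLikeJSONRPC(jsonRPC)) // true
	fmt.Println(looksLikeJSONRPC(chat))    // false
}
```

Supporting REST would presumably mean choosing the request type from the service configuration or the HTTP request itself rather than from the body shape, but that design is outside the scope of this PR.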

I suggest we merge this as is to unblock work on request types other than JSON-RPC.

Btw, I picked qwen:0.5b as it was one of the smallest recent LLMs. We don't get hardware optimizations on LocalNet, so it makes sense to use the smallest model possible. We can go crazy on DevNet, though.

okdas self-assigned this Apr 27, 2024
okdas added the infra and tooling labels Apr 27, 2024
okdas added this to the Shannon Private TestNet milestone Apr 27, 2024
okdas marked this pull request as ready for review April 27, 2024 00:21
okdas requested a review from Olshansk April 27, 2024 00:22
Olshansk requested a review from red-0ne April 27, 2024 16:33

Olshansk (Member) commented

Great find @okdas.

@red-0ne We'll have to prioritize adding support for gRPC, REST and the other request types shortly so we're not limited to just JSON-RPC.

red-0ne (Contributor) left a comment

We should also have suppliers stake for that service if we want the RelayMiners to run with these configs.

Olshansk (Member) commented

@red-0ne With @okdas OOO for the next week, can you update the branch so we can merge it in please?

It'll help unblock development on non-JSON-RPC request types.

red-0ne (Contributor) commented Apr 29, 2024

@Olshansk, I added ollama services to supplier_stake_configs, along with a small change to the config parser so that it accepts both lowercase and uppercase rpc type values.
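
A case-insensitive parse of this kind could look roughly like the sketch below. It is only an illustration: the rpcType type, the parseRPCType function, and the accepted values (JSON_RPC, REST, GRPC) are assumptions made for the example and are not taken from the actual config parser.

```go
package main

import (
	"fmt"
	"strings"
)

// rpcType and its values are hypothetical; the real parser may use
// different names and a different set of supported types.
type rpcType string

const (
	rpcTypeJSONRPC rpcType = "JSON_RPC"
	rpcTypeREST    rpcType = "REST"
	rpcTypeGRPC    rpcType = "GRPC"
)

// parseRPCType normalizes the raw config value to upper case before
// matching, so "json_rpc", "Json_Rpc" and "JSON_RPC" are all accepted.
func parseRPCType(raw string) (rpcType, error) {
	switch t := rpcType(strings.ToUpper(strings.TrimSpace(raw))); t {
	case rpcTypeJSONRPC, rpcTypeREST, rpcTypeGRPC:
		return t, nil
	default:
		return "", fmt.Errorf("unknown rpc type %q", raw)
	}
}

func main() {
	for _, raw := range []string{"json_rpc", "JSON_RPC", "Rest", "soap"} {
		t, err := parseRPCType(raw)
		fmt.Println(t, err)
	}
}
```

Normalizing once at parse time keeps the rest of the code comparing against a single canonical spelling.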

Olshansk mentioned this pull request Apr 29, 2024 (8 tasks)
Tiltfile: review thread resolved

The CI will now also run the e2e tests on devnet, which increases the time it takes to complete all CI checks. If you just created a pull request, you might need to push another commit to produce a container image DevNet can utilize to spin up infrastructure. You can use make trigger_ci to push an empty commit.

github-actions bot added the devnet and push-image labels Apr 29, 2024

Olshansk (Member) commented

@red-0ne I added this TODO in the code: # TODO(#511): Add support for REST and enabled this.

Assuming the E2E tests pass and there are no further changes you deem necessary, let's merge it in.

Olshansk (Member) commented May 3, 2024

@red-0ne - @okdas helped me figure out the E2E issue, which I resolved in [1]. Are you okay with approving this so we can merge it in and iterate on REST later?

[1] pokt-network/protocol-infra#18

Olshansk merged commit 3dee9c1 into main May 3, 2024 (9 checks passed)
bryanchriswhite added a commit that referenced this pull request May 6, 2024
…testutils

* pokt/main:
  [LocalNet] Add infrastructure to run LLM inference (#508)
bryanchriswhite added a commit that referenced this pull request May 6, 2024
…cept

* pokt/main:
  [LocalNet] Add infrastructure to run LLM inference (#508)
bryanchriswhite added a commit that referenced this pull request May 8, 2024
* pokt/main:
  [Code Health] chore: cleanup localnet testutils (#515)
  Zero retryLimit Support in ReplayClient (#442)
  [LocalNet] Add infrastructure to run LLM inference (#508)
  [LocalNet] Documentation for MVT/LocalNet (#488)
  [GATEWAY] Makefile target added to send relays to grove gateway (#487)
  Update README
  [CI] Add GATEWAY_URL envar for e2e tests (#506)
  [Tooling] Add gateway stake/unstake/ logs (#503)
bryanchriswhite removed the push-image and devnet-test-e2e labels May 16, 2024
github-actions bot removed the devnet label May 16, 2024
Olshansk deleted the dk-ollama branch May 29, 2024 16:43
Labels: infra, tooling
Project status: ✅ Done
4 participants