[LocalNet] Add infrastructure to run LLM inference #508
Note: this functionality is gated and turned off by default, to avoid downloading and serving an LLM and so preserve resources. Turn on ollama in
The infrastructure works on its own. The request can be run with curl:
kubectl exec "$(tilt get kd validator -ojsonpath='{.status.pods[0].name}')" -- \
curl -X POST http://ollama:11434/v1/chat/completions -H "Content-Type: application/json" \
-d '{
"model": "qwen:0.5b",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello!"
}
]
}'
However, it doesn't seem like we support anything but JSON-RPC at the moment: poktroll/pkg/partials/partial.go Line 50 in aba098d
I get the following error:
{"level":"error","error":"got: {\n \"model\": \"qwen:0.5b\",\n \"messages\": [\n {\n \"role\": \"system\",\n \"content\": \"You are a helpful assistant.\"\n },\n {\n \"role\": \"user\",\n \"content\": \"Hello!\"\n }\n ]\n }: unrecognised request format in partial payload","service_id":"ollama","message":"failed getting error reply"}
{"level":"error","error":"got: {\n \"model\": \"qwen:0.5b\",\n \"messages\": [\n {\n \"role\": \"system\",\n \"content\": \"You are a helpful assistant.\"\n },\n {\n \"role\": \"user\",\n \"content\": \"Hello!\"\n }\n ]\n }: unrecognised request format in partial payload","message":"failed getting request type"}
I suggest we merge this as is to unblock work on request types other than JSON-RPC. Btw, I picked |
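To illustrate the mismatch behind the error above, here is a minimal Python sketch of the kind of check that would reject the chat-completions body. It is not the code in partial.go; `classify_payload` is a hypothetical helper, and the only rule it applies is that a JSON-RPC 2.0 request object carries `"jsonrpc": "2.0"` and a `method` member (batch requests are ignored for brevity). The `eth_blockNumber` call is only an illustrative JSON-RPC request.

```python
import json


def classify_payload(raw: bytes) -> str:
    """Hypothetical helper: label a request body as 'json-rpc' or 'unrecognised'."""
    try:
        body = json.loads(raw)
    except ValueError:
        # Covers both invalid JSON and bytes that cannot be decoded.
        return "unrecognised"
    # A JSON-RPC 2.0 request object has a "jsonrpc": "2.0" member and a "method".
    if isinstance(body, dict) and body.get("jsonrpc") == "2.0" and "method" in body:
        return "json-rpc"
    return "unrecognised"


# The same body that the curl command above sends to ollama.
chat_request = json.dumps({
    "model": "qwen:0.5b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}).encode()

# Illustrative JSON-RPC 2.0 request for comparison.
jsonrpc_request = json.dumps({
    "jsonrpc": "2.0",
    "method": "eth_blockNumber",
    "params": [],
    "id": 1,
}).encode()

print(classify_payload(chat_request))
print(classify_payload(jsonrpc_request))
```

The chat-completions body is valid JSON but has neither a `jsonrpc` nor a `method` member, so a detector that only knows JSON-RPC has nothing to match on. Supporting REST-style requests would presumably need a separate branch keyed on something other than the body shape.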
We should also have suppliers to stake for that service if we want the RelayMiners to run with these configs
@Olshansk , I added |
The CI will now also run the e2e tests on devnet, which increases the time it takes to complete all CI checks. If you just created a pull request, you might need to push another commit to produce a container image DevNet can utilize to spin up infrastructure. You can use |
@red-0ne I added this TODO in the code:
Assuming the E2E tests pass and there are no further changes you deem necessary, let's merge it in.
Adds infrastructure to run and develop against LLM on LocalNet.
---
Co-authored-by: Redouane Lakrache
Co-authored-by: Daniel Olshansky
Summary
Adds infrastructure to run and develop against an LLM on LocalNet.
Issue
LLM service in LocalNet/DevNet infra #130
Type of change
Select one or more:
Testing
Documentation changes (only if making doc changes)
make docusaurus_start; only needed if you make doc changes
Local Testing (only if making code changes)
make go_develop_and_test
make test_e2e
PR Testing (only if making code changes)
Add the devnet-test-e2e label to the PR.
make trigger_ci if you want to re-trigger tests without any code changes
Sanity Checklist