OLLAMA_HOST support (#821)
* handling OLLAMA_HOST

* docs: ✏️ add note on GenAIScript using Ollama API

* added ollama docker command

* updated docs
pelikhan authored Nov 4, 2024
1 parent 475a49b commit 2417113
Showing 7 changed files with 125 additions and 32 deletions.
5 changes: 3 additions & 2 deletions .github/workflows/ollama.yml
@@ -32,5 +32,6 @@ jobs:
run: docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
- name: run summarize-ollama-phi3
run: yarn test:summarize --model ollama:phi3.5 --out ./temp/summarize-ollama-phi3
# - name: run vector-search
# run: yarn run:script vector-search --model ollama:phi3 --out ./temp/rag
env:
OLLAMA_HOST: "http://localhost:11434"

40 changes: 23 additions & 17 deletions docs/src/content/docs/getting-started/configuration.mdx
@@ -720,12 +720,7 @@ script({

</Steps>

## Local Models

There are many projects that allow you to run models locally on your machine,
or in a container.

### LocalAI
## LocalAI

[LocalAI](https://localai.io/) acts as a drop-in replacement REST API that's compatible
with the OpenAI API specification for local inferencing. It uses free Open Source models
Expand Down Expand Up @@ -758,14 +753,20 @@ OPENAI_API_TYPE=localai

</Steps>

### Ollama
## Ollama

[Ollama](https://ollama.ai/) is a desktop application that lets you download and run models locally.

Running tools locally may require additional GPU resources depending on the model you are using.

Use the `ollama` provider to access Ollama models.

:::note

GenAIScript currently uses Ollama's OpenAI API compatibility layer.

:::
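
In practice this means requests go to Ollama's OpenAI-compatible `/v1` routes rather than its native `/api` routes. As a rough illustration (not part of GenAIScript itself, and assuming Ollama is running on the default port with the `phi3.5` model already pulled), you can exercise that layer directly:

```ts
// Minimal sketch: call Ollama's OpenAI compatibility endpoint directly.
// Assumes Ollama listens on http://localhost:11434 and `phi3.5` is pulled.
const res = await fetch("http://localhost:11434/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
        model: "phi3.5",
        messages: [{ role: "user", content: "Say hello in one word." }],
    }),
})
const data = await res.json()
console.log(data.choices?.[0]?.message?.content)
```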

<Steps>

<ol>
@@ -795,6 +796,18 @@ GenAIScript will automatically pull the model, which may take some time dependin

</li>

<li>

If Ollama runs on a remote server, a different computer, or a non-default port,
configure the `OLLAMA_HOST` environment variable so GenAIScript can reach it
(see the sketch after these steps for how the value is resolved).

```txt title=".env"
OLLAMA_HOST=https://<IP or domain>:<port>/ # server url
OLLAMA_HOST=0.0.0.0:12345 # different port
```

</li>

</ol>

</Steps>
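
The host value does not have to be a full URL. The sketch below is an illustrative approximation of how this release normalizes it into the OpenAI-compatible base URL; `resolveOllamaBase` is not an actual GenAIScript export, and `models.example.com` is a placeholder for your own server:

```ts
const OLLAMA_DEFAULT_PORT = 11434

// Illustrative only: approximates how OLLAMA_HOST is normalized.
function resolveOllamaBase(host: string): string {
    // bare "localhost" or IPv4 addresses get an http:// scheme and the default port
    const m = /^(localhost|\d+\.\d+\.\d+\.\d+)(:(\d+))?$/i.exec(host.trim())
    const url = m
        ? `http://${m[1]}:${m[3] || OLLAMA_DEFAULT_PORT}`
        : new URL(host).href
    // GenAIScript talks to the OpenAI compatibility layer, which lives under /v1
    const base = url.replace(/\/+$/, "")
    return /\/v1$/.test(base) ? base : base + "/v1"
}

resolveOllamaBase("0.0.0.0:12345") // -> "http://0.0.0.0:12345/v1"
resolveOllamaBase("https://models.example.com:11434/") // -> "https://models.example.com:11434/v1"
```
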
@@ -817,21 +830,14 @@ script({
})
```

If Ollama runs on a server or a different computer, you have to configure the `OLLAMA_API_BASE` environment variable.

```txt OLLAMA_API_BASE
OLLAMA_API_BASE=http://<IP or domain>:<port>/v1
```
As GenAIScript uses the OpenAI-style API, you must use the `/v1` endpoint and not `/api`.

### Llamafile
## Llamafile

[https://llamafile.ai/](https://llamafile.ai/) is a single-file desktop application
that allows you to run an LLM locally.

The provider is `llamafile` and the model name is ignored.
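
For example, a script could select it like this (a sketch; per the note above, the model name carries no meaning for this provider):

```ts
script({
    title: "hello llamafile",
    // the `llamafile` provider targets the local llamafile server;
    // the model name is ignored, so no model id is needed
    model: "llamafile",
})
$`Write a one-sentence greeting.`
```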

### Jan, LMStudio, LLaMA.cpp
## Jan, LMStudio, LLaMA.cpp

[Jan](https://jan.ai/), [LMStudio](https://lmstudio.ai/),
[LLaMA.cpp](https://github.com/ggerganov/llama.cpp/tree/master/examples/server)
Expand All @@ -855,7 +861,7 @@ OPENAI_API_BASE=http://localhost:...

</Steps>

### Model specific environment variables
## Model specific environment variables

You can provide different environment variables
for each named model by using the `PROVIDER_MODEL_API_...` prefix or `PROVIDER_API_...` prefix.
4 changes: 3 additions & 1 deletion package.json
@@ -71,7 +71,9 @@
"prd": "node packages/cli/built/genaiscript.cjs run prd -prd",
"genai": "node packages/cli/built/genaiscript.cjs run",
"upgrade:deps": "zx scripts/upgrade-deps.mjs",
"cli": "node packages/cli/built/genaiscript.cjs"
"cli": "node packages/cli/built/genaiscript.cjs",
"ollama": "docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama",
"ollama:stop": "docker stop ollama && docker rm ollama"
},
"release-it": {
"github": {
32 changes: 21 additions & 11 deletions packages/core/src/connection.ts
@@ -30,6 +30,7 @@ import {
AzureCredentialsType,
} from "./host"
import { parseModelIdentifier } from "./models"
import { parseHostVariable } from "./ollama"
import { normalizeFloat, trimTrailingSlash } from "./util"

export async function parseDefaultsFromEnv(env: Record<string, string>) {
@@ -276,6 +277,19 @@ export async function parseTokenFromEnv(
}
}

    // Ollama: resolve the host from OLLAMA_HOST / OLLAMA_API_BASE and
    // normalize it to an OpenAI-compatible base URL ending in /v1
    if (provider === MODEL_PROVIDER_OLLAMA) {
        const host = parseHostVariable(env)
        const base = cleanApiBase(host)
        return {
            provider,
            model,
            base,
            token: "ollama",
            type: "openai",
            source: "env: OLLAMA_HOST",
        }
    }

    const prefixes = [
        tag ? `${provider}_${model}_${tag}` : undefined,
        provider ? `${provider}_${model}` : undefined,
@@ -307,17 +321,6 @@ export async function parseTokenFromEnv(
}
}

    if (provider === MODEL_PROVIDER_OLLAMA) {
        return {
            provider,
            model,
            base: OLLAMA_API_BASE,
            token: "ollama",
            type: "openai",
            source: "default",
        }
    }

if (provider === MODEL_PROVIDER_LLAMAFILE) {
return {
provider,
@@ -358,6 +361,13 @@
`/openai/deployments`
return b
}

    function cleanApiBase(b: string) {
        if (!b) return b
        b = trimTrailingSlash(b)
        // the OpenAI compatibility layer is served under /v1
        if (!/\/v1$/.test(b)) b += "/v1"
        return b
    }
}

export async function updateConnectionConfiguration(
1 change: 1 addition & 0 deletions packages/core/src/constants.ts
@@ -124,6 +124,7 @@ export const PROMPT_FENCE = "```"
export const MARKDOWN_PROMPT_FENCE = "`````"

export const OPENAI_API_BASE = "https://api.openai.com/v1"
export const OLLAMA_DEFAUT_PORT = 11434
export const OLLAMA_API_BASE = "http://localhost:11434/v1"
export const LLAMAFILE_API_BASE = "http://localhost:8080/v1"
export const LOCALAI_API_BASE = "http://localhost:8080/v1"
54 changes: 54 additions & 0 deletions packages/core/src/ollama.test.ts
@@ -0,0 +1,54 @@
import { describe, test } from "node:test"
import assert from "node:assert/strict"
import { parseHostVariable } from "./ollama"
import { OLLAMA_API_BASE, OLLAMA_DEFAUT_PORT } from "./constants"

describe("parseHostVariable", () => {
test("parses OLLAMA_HOST environment variable correctly", () => {
const env = { OLLAMA_HOST: "http://localhost:3000" }
const result = parseHostVariable(env)
assert.strictEqual(result, "http://localhost:3000/")
})

test("parses OLLAMA_API_BASE environment variable correctly", () => {
const env = { OLLAMA_API_BASE: "http://api.ollama.com" }
const result = parseHostVariable(env)
assert.strictEqual(result, "http://api.ollama.com/")
})

test("falls back to OLLAMA_API_BASE constant if no environment variable is set", () => {
const env = {}
const result = parseHostVariable(env)
assert.strictEqual(result, OLLAMA_API_BASE)
})

test("parses IP address with port correctly", () => {
const env = { OLLAMA_HOST: "192.168.1.1:8080" }
const result = parseHostVariable(env)
assert.strictEqual(result, "http://192.168.1.1:8080")
})

test("parses IP address without port correctly", () => {
const env = { OLLAMA_HOST: "192.168.1.1" }
const result = parseHostVariable(env)
assert.strictEqual(result, `http://192.168.1.1:${OLLAMA_DEFAUT_PORT}`)
})

test("parses 0.0.0.0 with port correctly", () => {
const env = { OLLAMA_HOST: "0.0.0.0:4000" }
const result = parseHostVariable(env)
assert.strictEqual(result, "http://0.0.0.0:4000")
})

test("parses localhost with port correctly", () => {
const env = { OLLAMA_HOST: "localhost:4000" }
const result = parseHostVariable(env)
assert.strictEqual(result, "http://localhost:4000")
})

test("parses 0.0.0.0 without port correctly", () => {
const env = { OLLAMA_HOST: "0.0.0.0" }
const result = parseHostVariable(env)
assert.strictEqual(result, `http://0.0.0.0:${OLLAMA_DEFAUT_PORT}`)
})
})
21 changes: 20 additions & 1 deletion packages/core/src/ollama.ts
@@ -1,11 +1,16 @@
// Import necessary modules and types for handling chat completions and model management
import { ChatCompletionHandler, LanguageModel, LanguageModelInfo } from "./chat"
import { MODEL_PROVIDER_OLLAMA } from "./constants"
import {
    MODEL_PROVIDER_OLLAMA,
    OLLAMA_API_BASE,
    OLLAMA_DEFAUT_PORT,
} from "./constants"
import { isRequestError } from "./error"
import { createFetch } from "./fetch"
import { parseModelIdentifier } from "./models"
import { OpenAIChatCompletion } from "./openai"
import { LanguageModelConfiguration, host } from "./host"
import { URL } from "url"

/**
* Handles chat completion requests using the Ollama model.
@@ -105,3 +110,17 @@ export const OllamaModel = Object.freeze<LanguageModel>({
id: MODEL_PROVIDER_OLLAMA,
listModels,
})

/**
 * Resolves the Ollama endpoint from OLLAMA_HOST or OLLAMA_API_BASE,
 * falling back to the default API base. Bare `localhost` or IPv4
 * addresses are expanded into an http:// URL, using the default port
 * when none is given.
 */
export function parseHostVariable(env: Record<string, string>) {
    const s = (
        env.OLLAMA_HOST ||
        env.OLLAMA_API_BASE ||
        OLLAMA_API_BASE
    )?.trim()
    const ipm =
        /^(?<address>(localhost|\d+\.\d+\.\d+\.\d+))(:(?<port>\d+))?$/i.exec(s)
    if (ipm)
        return `http://${ipm.groups.address}:${ipm.groups.port || OLLAMA_DEFAUT_PORT}`
    const url = new URL(s)
    return url.href
}
