Using local Ollama models #201

Open
glenstarchman opened this issue Apr 1, 2024 · 1 comment

@glenstarchman

This is neither a feature request nor a bug report, but hopefully others will find it useful.

I wanted to experiment with code refactoring using local models while still using the awesome chatgpt-shell. Here is how I got it to work:

;; your ollama endpoint
(setq chatgpt-shell-api-url-base "http://127.0.0.1:11434")
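
As an optional sanity check (not part of the setup itself), you can confirm the server is reachable from Emacs before pointing chatgpt-shell at it; Ollama answers a plain GET on its root URL with "Ollama is running":

;; Optional sanity check: Ollama replies "Ollama is running" on a
;; plain GET to its root URL when the server is up.
(require 'url)
(with-current-buffer (url-retrieve-synchronously "http://127.0.0.1:11434")
  (goto-char (point-min))
  (if (search-forward "Ollama is running" nil t)
      (message "Ollama endpoint is up")
    (message "Unexpected response from Ollama")))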

;; models you have pulled for use with ollama
(setq chatgpt-shell-model-versions
      '("gemma:2b-instruct"
        "zephry:latest"
        "codellama:instruct"
        "magicoder:7b-s-cl-q4_0"
        "starcoder:latest"
        "deepseek-coder:1.3b-instruct-q5_1"
        "qwen:1.8b"
        "mistral:7b-instruct"
        "orca-mini:7b"
        "orca-mini:3b"
        "openchat:7b-v3.5-q4_0"))

;; override how chatgpt-shell determines the context length
;; NOTE: use this as a template and adjust as needed
(defun chatgpt-shell--approximate-context-length (model messages)
  "Approximate the context length using MODEL and MESSAGES."
  ;; Remove "ft:" from fine-tuned models and recognize as usual.
  (setq model (string-remove-prefix "ft:" model))
  (let* ((tokens-per-message 4)
         ;; Approximate context windows for each model family; adjust
         ;; these to match the models you actually pull.
         (max-tokens
          (cond
           ((string-prefix-p "starcoder" model) 4096)
           ((string-prefix-p "magicoder" model) 4096)
           ((string-prefix-p "gemma" model) 8192)
           ((string-prefix-p "openchat" model) 8192)
           ((string-prefix-p "codellama" model) 8192)
           ((string-prefix-p "zephyr" model) 8192)
           ((string-prefix-p "qwen" model) 8192)
           ((string-prefix-p "deepseek-coder" model) 8192)
           ((string-prefix-p "mistral" model) 8192)
           ((string-prefix-p "orca" model) 8192)
           (t (error "Don't know '%s', so can't approximate context length"
                     model))))
         (original-length (floor (/ (length messages) 2))))
    ;; Drop the oldest messages until the transcript fits the window.
    (while (> (chatgpt-shell--num-tokens-from-messages
               tokens-per-message messages)
              max-tokens)
      (setq messages (cdr messages)))
    (let ((context-length (floor (/ (length messages) 2))))
      (unless (eq original-length context-length)
        (message "Warning: chatgpt-shell context clipped"))
      context-length)))
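
An alternative to redefining the library function in place: define the same logic under your own name and install it as advice, so the override survives a reload of chatgpt-shell. A minimal sketch; `my/ollama-context-length` is a hypothetical name for a copy of the function above:

;; Sketch: install the override as advice instead of clobbering the
;; library's definition, so it survives a reload of chatgpt-shell.
;; `my/ollama-context-length' is a hypothetical name for a copy of
;; the function above, defined in your own namespace.
(advice-add 'chatgpt-shell--approximate-context-length
            :override #'my/ollama-context-length)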

I have found that the gemma models integrate best, with correct code formatting and so on, but your mileage may vary.

Most chatgpt-shell features work, and you can even change models with C-c C-v.

@xenodium (Owner) commented Apr 4, 2024

Thanks for this, Glen! This is impressive and great to see. I'd been meaning to create a higher-level abstraction that reuses more chatgpt-shell things than shell-maker (https://xenodium.com/a-shell-maker).

I've not had a chance to play with these models. I'm guessing they also implement OpenAI's API/schema, which would make it easier for chatgpt-shell to reuse more things.
