Using local Ollama models #201

Open
glenstarchman opened this issue Apr 1, 2024 · 1 comment

@glenstarchman

This is neither a feature request nor a bug report, but hopefully others will find it useful.

I wanted to experiment with code refactoring using local models while still using the awesome chatgpt-shell. Here is how I got it to work:

;; your ollama endpoint
(setq chatgpt-shell-api-url-base "http://127.0.0.1:11434")
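
As an optional sanity check (not part of the setup itself), you can confirm the server is reachable from Emacs before pointing chatgpt-shell at it; Ollama answers a plain GET on its root URL with "Ollama is running":

;; Optional sanity check: Ollama replies "Ollama is running" on a
;; plain GET to its root URL when the server is up.
(require 'url)
(with-current-buffer (url-retrieve-synchronously "http://127.0.0.1:11434")
  (goto-char (point-min))
  (if (search-forward "Ollama is running" nil t)
      (message "Ollama endpoint is up")
    (message "Unexpected response from Ollama")))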

;; models you have pulled for use with ollama
(setq chatgpt-shell-model-versions
      '("gemma:2b-instruct"
        "zephry:latest"
        "codellama:instruct"
        "magicoder:7b-s-cl-q4_0"
        "starcoder:latest"
        "deepseek-coder:1.3b-instruct-q5_1"
        "qwen:1.8b"
        "mistral:7b-instruct"
        "orca-mini:7b"
        "orca-mini:3b"
        "openchat:7b-v3.5-q4_0"))

;; override how chatgpt-shell determines the context length
;; NOTE: use this as a template and adjust as needed
(defun chatgpt-shell--approximate-context-length (model messages)
  "Approximate the context length using MODEL and MESSAGES."
  ;; Remove "ft:" from fine-tuned models and recognize as usual.
  (setq model (string-remove-prefix "ft:" model))
  (let* ((tokens-per-message 4)
         ;; Approximate context windows for each model family; adjust
         ;; these to match the models you actually pull.
         (max-tokens
          (cond
           ((string-prefix-p "starcoder" model) 4096)
           ((string-prefix-p "magicoder" model) 4096)
           ((string-prefix-p "gemma" model) 8192)
           ((string-prefix-p "openchat" model) 8192)
           ((string-prefix-p "codellama" model) 8192)
           ((string-prefix-p "zephyr" model) 8192)
           ((string-prefix-p "qwen" model) 8192)
           ((string-prefix-p "deepseek-coder" model) 8192)
           ((string-prefix-p "mistral" model) 8192)
           ((string-prefix-p "orca" model) 8192)
           (t (error "Don't know '%s', so can't approximate context length"
                     model))))
         (original-length (floor (/ (length messages) 2))))
    ;; Drop the oldest messages until the transcript fits the window.
    (while (> (chatgpt-shell--num-tokens-from-messages
               tokens-per-message messages)
              max-tokens)
      (setq messages (cdr messages)))
    (let ((context-length (floor (/ (length messages) 2))))
      (unless (eq original-length context-length)
        (message "Warning: chatgpt-shell context clipped"))
      context-length)))
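
An alternative to redefining the library function in place: define the same logic under your own name and install it as advice, so the override survives a reload of chatgpt-shell. A minimal sketch; `my/ollama-context-length` is a hypothetical name for a copy of the function above:

;; Sketch: install the override as advice instead of clobbering the
;; library's definition, so it survives a reload of chatgpt-shell.
;; `my/ollama-context-length' is a hypothetical name for a copy of
;; the function above, defined in your own namespace.
(advice-add 'chatgpt-shell--approximate-context-length
            :override #'my/ollama-context-length)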

I have found that the gemma models integrate best, with correct code formatting and so on, but your mileage may vary.

Most chatgpt-shell features work, and you can even change models with C-c C-v.

@xenodium (Owner) commented Apr 4, 2024

Thanks for this, Glen! This is impressive and great to see. I'd been meaning to create a higher-level abstraction that reuses more chatgpt-shell things than shell-maker (https://xenodium.com/a-shell-maker).

I've not had a chance to play with these models. I'm guessing they also implement OpenAI's API/schema, which would make it easier for chatgpt-shell to reuse more things.
