使用 Ollama 快速运行大型语言模型
curl -fsSL https://ollama.com/install.sh | sh
官方的 Ollama 的 Docker 镜像 ollama/ollama
可在 Docker Hub 上获取。
使用命令 ollama run llama3
运行并和 Llama 3 模型进行对话。
ollama run llama3
Ollama支持的模型列表可以在 ollama.com/library 查看
以下是一些可供下载的模型示例
Model | Parameters | Size | Download |
---|---|---|---|
Llama 3 | 8B | 4.7GB | ollama run llama3 |
Llama 3 | 70B | 40GB | ollama run llama3:70b |
Phi 3 Mini | 3.8B | 2.3GB | ollama run phi3 |
Phi 3 Medium | 14B | 7.9GB | ollama run phi3:medium |
Gemma | 2B | 1.4GB | ollama run gemma:2b |
Gemma | 7B | 4.8GB | ollama run gemma:7b |
Mistral | 7B | 4.1GB | ollama run mistral |
Moondream 2 | 1.4B | 829MB | ollama run moondream |
Neural Chat | 7B | 4.1GB | ollama run neural-chat |
Starling | 7B | 4.1GB | ollama run starling-lm |
Code Llama | 7B | 3.8GB | ollama run codellama |
Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored |
LLaVA | 7B | 4.5GB | ollama run llava |
Solar | 10.7B | 6.1GB | ollama run solar |
请注意:您至少需要 8GB 的 RAM 来运行 7B 模型,运行 13B 模型需要 16GB,以及运行 33B 模型需要 32GB。
Ollama 支持在 Modelfile 中导入 GGUF 模型:
-
创建一个名为
Modelfile
的文件,其中包含一条指令,其中包含要导入的模型的本地文件路径FROM ./vicuna-33b.Q4_0.gguf
-
在 Ollama 中创建模型
ollama create example -f Modelfile
-
运行模型
ollama run example
有关详细信息,请参阅有关导入模型的 指南。
Ollama 库中的模型可以对 prompt 进行定制。例如,基于模型 llama3
自定义:
ollama pull llama3
创建 Modelfile
:
FROM llama3
# 将温度设置为 1 [越高越有创意,越低越连贯]
PARAMETER temperature 1
# 设置系统消息
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
接下来,创建并运行模型:
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
有关更多示例,请参阅 examples 目录。有关使用 Modelfile 的详细信息,请参阅 Modelfile 文档
ollama create
用于从 Modelfile 创建模型
ollama create mymodel -f ./Modelfile
ollama pull llama3
此命令还可用于更新本地模型。对于本地已存在的模型只会拉取差异部分。
ollama rm llama3
ollama cp llama3 my-model
对于多行输入,您可以使用以下命令换行文本:"""
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
>>> What's in this image? /Users/jmorgan/Desktop/smile.png
The image features a yellow smiley face, which is likely the central focus of the picture.
$ ollama run llama3 "Summarize this file: $(cat README.md)"
Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
ollama show llama3
ollama list
当您想要在不运行桌面应用程序的情况下启动 Ollama 时,您可以使用 ollama serve
更多信息请参阅 开发者指南
接下来,启动服务器:
./ollama serve
最后,在单独的 shell 中,运行一个模型:
./ollama run llama3
Ollama 有用于运行和管理模型的 REST API。
curl http://localhost:11434/api/generate -d '{
"model": "llama3",
"prompt":"Why is the sky blue?"
}'
curl http://localhost:11434/api/chat -d '{
"model": "llama3",
"messages": [
{ "role": "user", "content": "why is the sky blue?" }
]
}'
关于所有端点请参阅 API 文档
- Open WebUI
- Enchanted (macOS native)
- Hollama
- Lollms-Webui
- LibreChat
- Bionic GPT
- HTML UI
- Saddle
- Chatbot UI
- Chatbot UI v2
- Typescript UI
- Minimalistic React UI for Ollama Models
- Ollamac
- big-AGI
- Cheshire Cat assistant framework
- Amica
- chatd
- Ollama-SwiftUI
- Dify.AI
- MindMac
- NextJS Web Interface for Ollama
- Msty
- Chatbox
- WinForm Ollama Copilot
- NextChat with Get Started Doc
- Alpaca WebUI
- OllamaGUI
- OpenAOE
- Odin Runes
- LLM-X (Progressive Web App)
- AnythingLLM (Docker + MacOs/Windows/Linux native app)
- Ollama Basic Chat: Uses HyperDiv Reactive UI
- Ollama-chats RPG
- QA-Pilot (Chat with Code Repository)
- ChatOllama (Open Source Chatbot based on Ollama with Knowledge Bases)
- CRAG Ollama Chat (Simple Web Search with Corrective RAG)
- RAGFlow (Open-source Retrieval-Augmented Generation engine based on deep document understanding)
- StreamDeploy (LLM Application Scaffold)
- chat (chat web app for teams)
- Lobe Chat with Integrating Doc
- Ollama RAG Chatbot (Local Chat with multiple PDFs using Ollama and RAG)
- BrainSoup (Flexible native client with RAG & multi-agent automation)
- macai (macOS client for Ollama, ChatGPT, and other compatible API back-ends)
- Olpaka (User-friendly Flutter Web App for Ollama)
- OllamaSpring (Ollama Client for macOS)
- LLocal.in (Easy to use Electron Desktop Client for Ollama)
- oterm
- Ellama Emacs client
- Emacs client
- gen.nvim
- ollama.nvim
- ollero.nvim
- ollama-chat.nvim
- ogpt.nvim
- gptel Emacs client
- Oatmeal
- cmdh
- ooo
- shell-pilot
- tenere
- llm-ollama for Datasette's LLM CLI.
- typechat-cli
- ShellOracle
- tlm
- podman-ollama
- gollama
- MindsDB (Connects Ollama models with nearly 200 data platforms and apps)
- chromem-go with example
- LangChain and LangChain.js with example
- LangChainGo with example
- LangChain4j with example
- LangChainRust with example
- LlamaIndex
- LiteLLM
- OllamaSharp for .NET
- Ollama for Ruby
- Ollama-rs for Rust
- Ollama-hpp for C++
- Ollama4j for Java
- ModelFusion Typescript Library
- OllamaKit for Swift
- Ollama for Dart
- Ollama for Laravel
- LangChainDart
- Semantic Kernel - Python
- Haystack
- Elixir LangChain
- Ollama for R - rollama
- Ollama for R - ollama-r
- Ollama-ex for Elixir
- Ollama Connector for SAP ABAP
- Testcontainers
- Portkey
- PromptingTools.jl with an example
- LlamaScript
- Raycast extension
- Discollama (Discord bot inside the Ollama discord channel)
- Continue
- Obsidian Ollama plugin
- Logseq Ollama plugin
- NotesOllama (Apple Notes Ollama plugin)
- Dagger Chatbot
- Discord AI Bot
- Ollama Telegram Bot
- Hass Ollama Conversation
- Rivet plugin
- Obsidian BMO Chatbot plugin
- Cliobot (Telegram bot with Ollama support)
- Copilot for Obsidian plugin
- Obsidian Local GPT plugin
- Open Interpreter
- Llama Coder (Copilot alternative using Ollama)
- Ollama Copilot (Proxy that allows you to use ollama as a copilot like Github copilot)
- twinny (Copilot and Copilot chat alternative using Ollama)
- Wingman-AI (Copilot code and chat alternative using Ollama and HuggingFace)
- Page Assist (Chrome Extension)
- AI Telegram Bot (Telegram bot using Ollama in backend)
- AI ST Completion (Sublime Text 4 AI assistant plugin with Ollama support)
- Discord-Ollama Chat Bot (Generalized TypeScript Discord Bot w/ Tuning Documentation)
- Discord AI chat/moderation bot Chat/moderation bot written in python. Uses Ollama to create personalities.
- Headless Ollama (Scripts to automatically install ollama client & models on any OS for apps that depends on ollama server)
- Ollama 由 Georgi Gerganov 所创建的项目 llama.cpp 强力驱动。