English | 中文
> [!NOTE]
> **About This Fork**: This project is forked from ericc-ch/copilot-api. Since the original author has discontinued maintenance and no longer supports the new API, it has been redesigned and rewritten here. Special thanks to @ericc-ch for the original work and contribution!
> [!WARNING]
> This is a reverse-engineered proxy for the GitHub Copilot API. It is not supported by GitHub and may break unexpectedly. Use at your own risk.
> [!WARNING]
> **GitHub Security Notice:**
> Excessive automated or scripted use of Copilot (including rapid or bulk requests via automated tools) may trigger GitHub's abuse-detection systems.
> You may receive a warning from GitHub Security, and further anomalous activity could result in temporary suspension of your Copilot access.
> GitHub prohibits use of its servers for excessive automated bulk activity or any activity that places undue burden on its infrastructure.
> Please use this proxy responsibly to avoid account restrictions.
> [!NOTE]
> If you are using opencode, you do not need this project; opencode supports the GitHub Copilot provider out of the box.
A reverse-engineered proxy for the GitHub Copilot API that exposes OpenAI-compatible, Anthropic-compatible, and Gemini-compatible interfaces. The gateway routes by each model's `supported_endpoints` capabilities and performs protocol translation when needed, so clients using OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, or Gemini `generateContent`-style calls can all work with the same backend (including Claude Code).
The project currently works as a capability-driven routing gateway, not a single-path passthrough proxy:

- It exposes OpenAI / Anthropic / Gemini-compatible ingress endpoints.
- It selects upstream endpoint paths dynamically from each model's `supported_endpoints`.
- Ingress protocol and final upstream protocol may differ (with bidirectional format translation).
```mermaid
flowchart TB
    subgraph Clients["Clients"]
        C1[OpenAI-compatible clients]
        C2[Anthropic-compatible clients]
        C3[Gemini-compatible clients]
    end
    subgraph Proxy["copilot-api"]
        direction TB
        subgraph Ingress["Ingress"]
            I1["/v1/chat/completions"]
            I2["/v1/messages"]
            I3["/v1/responses"]
            I4["/v1beta/models/{model}:generateContent<br/>:streamGenerateContent"]
        end
        subgraph Router["Capability-driven routing"]
            R1[Route by supported_endpoints]
        end
        subgraph Upstream["Copilot upstream endpoints"]
            U1["/chat/completions"]
            U2["/v1/messages"]
            U3["/responses"]
        end
        subgraph Admin["Management & state"]
            A1["/admin"]
            A2["/usage"]
            A3["/token"]
            A4[config.json + runtime state]
        end
    end
    C1 --> I1
    C2 --> I2
    C3 --> I4
    I1 --> R1
    I2 --> R1
    I3 --> R1
    I4 --> R1
    R1 --> U1
    R1 --> U2
    R1 --> U3
```
**Anthropic ingress (`/v1/messages`):**

- If the model supports `messages` -> use `/v1/messages`
- Else if the model supports `responses` -> translate and use `/responses`
- Else -> translate and use `/chat/completions`

**OpenAI chat ingress (`/v1/chat/completions`):**

- If the model supports `chat` -> use `/chat/completions`
- Else if the model supports `messages` -> fall back to `/v1/messages`
- Else if the model supports `responses` -> fall back to `/responses`
- If the model declares `supported_endpoints` but none match -> return 400
- If endpoint metadata is missing or empty -> default to the chat path

**OpenAI Responses ingress (`/v1/responses`):**

- Allowed only when the model supports `responses`
- If not supported -> direct 400 (no multi-endpoint fallback)

**Gemini ingress:**

- Fixed chat-only design: always translate the Gemini request to `/chat/completions`
- Execution order: validate model capability first, then transform the Gemini payload into a Chat payload
- If `chat` is not supported -> direct 400 (no messages/responses fallback)
- Currently text-input only via `contents.parts.text`
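The chat-ingress fallback order above can be sketched roughly as follows. This is an illustrative TypeScript sketch, not the project's actual code: `chooseChatUpstream`, `ModelCaps`, and the capability strings (`"chat"`, `"messages"`, `"responses"`) are assumptions about how the `supported_endpoints` metadata is shaped.

```typescript
type Upstream = "/chat/completions" | "/v1/messages" | "/responses";

interface ModelCaps {
  // Mirrors the model's `supported_endpoints` metadata; undefined/empty
  // means the metadata is missing, in which case we default to the chat path.
  supported_endpoints?: string[];
}

// Routing decision for OpenAI chat ingress (/v1/chat/completions).
function chooseChatUpstream(caps: ModelCaps): Upstream | 400 {
  const eps = caps.supported_endpoints;
  if (!eps || eps.length === 0) return "/chat/completions"; // missing metadata -> default
  if (eps.includes("chat")) return "/chat/completions";
  if (eps.includes("messages")) return "/v1/messages"; // fallback with translation
  if (eps.includes("responses")) return "/responses";  // fallback with translation
  return 400; // capabilities declared, but none usable
}
```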
- Multi-protocol ingress: OpenAI Chat, OpenAI Responses, Anthropic Messages, and Gemini-compatible endpoints.
- Capability-driven routing: dynamically route by model `supported_endpoints`, without hardcoded model-name routing.
- Bidirectional translation layer: Anthropic <-> Chat, Anthropic <-> Responses, and Chat <-> Gemini-compatible translations.
- Web account management: add and manage multiple GitHub accounts from `/admin`.
- Multi-account support: switch active accounts without restarting the server.
- Docker-first deployment: container-focused deployment with persistent configuration.
- Usage monitoring: inspect usage and quota from `/usage`.
- Rate-limit controls: configurable throttling and wait strategy.
- Account-type support: individual / business / enterprise plans.
```bash
# Start the server
docker compose up -d

# View logs
docker compose logs -f
```

Then visit http://localhost:4141/admin to add your GitHub account.
```bash
docker run -d \
  --name copilot-api \
  -p 4141:4141 \
  -v copilot-data:/data \
  --restart unless-stopped \
  ghcr.io/qlhazycoder/copilot-api:latest
```

- Start the server using Docker
- Open http://localhost:4141/admin in your browser (must be accessed from localhost)
- Click "Add Account" to start the GitHub OAuth device flow
- Enter the code shown on GitHub's device authorization page
- Your account will be automatically configured once authorized
The admin panel includes five tabs: Accounts, Models, Usage, Model Mappings, and Settings.
**Accounts**

- Add, switch, remove, and reorder multiple GitHub accounts.
- The Accounts tab refreshes account status and usage on a polling cycle (near real-time, not WebSocket push).
- Usage is fetched per account token.
**Models**

- Models grouped by provider.
- Visible/hidden filtering and a visibility management mode.
- Double-click inline editing for premium multipliers (used in local usage-log accounting).
- Per-model default reasoning effort: options are displayed dynamically based on the model's supported reasoning levels, and the default is auto-filled when clients omit reasoning fields.
- Model cards display feature tags and context-window metadata.
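The reasoning-effort auto-fill can be sketched like this. The function name `applyDefaultEffort` and the request shape are illustrative assumptions; the effort levels come from the config documentation below.

```typescript
type Effort = "none" | "minimal" | "low" | "medium" | "high" | "xhigh";

// Fill in the per-model default reasoning effort only when the client
// omitted the field; an explicit client value always wins.
function applyDefaultEffort(
  body: { model: string; reasoning_effort?: Effort },
  defaults: Record<string, Effort>,
): { model: string; reasoning_effort?: Effort } {
  if (body.reasoning_effort !== undefined) return body;
  const fallback = defaults[body.model];
  return fallback ? { ...body, reasoning_effort: fallback } : body;
}
```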
**Usage**

- Usage overview cards plus a request log list.
- Logs are isolated by active account (no cross-account mixing).
- Supports `source` filtering (all/request) and cursor pagination; `endpoint` is currently display-only, not an independent filter.
- Configurable usage test/poll interval; the default interval comes from config (10 minutes), and the test request uses `gpt-4o`.
- Monthly cleanup is lazy-on-write (cleanup runs when new logs are appended), not an exact cron trigger.
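Lazy-on-write cleanup means pruning happens as a side effect of appending, not on a timer. A minimal sketch, with an assumed `appendWithCleanup` helper and a simplified flat log shape:

```typescript
interface LogEntry {
  timestamp: number; // epoch milliseconds
}

// Append a new log entry and, in the same write, drop entries older than
// the retention window. No cron job or background timer is involved.
function appendWithCleanup(
  logs: LogEntry[],
  entry: LogEntry,
  retentionMs: number,
): LogEntry[] {
  const cutoff = entry.timestamp - retentionMs;
  return [...logs.filter((l) => l.timestamp >= cutoff), entry];
}
```

Because cleanup only runs on append, stale entries can outlive the retention window until the next write, which matches the "not an exact cron trigger" caveat above.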
**Model Mappings**

- Add, copy, and delete model mappings.
- Map client-facing aliases to actual Copilot models.
- Target model options can be loaded dynamically from `/v1/models`.
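Conceptually, alias resolution is a simple lookup with passthrough for unmapped names. The function name `resolveModel` is illustrative, not the project's actual API:

```typescript
// Map a client-facing alias (e.g. "opus") to the configured Copilot model;
// names without a mapping pass through unchanged.
function resolveModel(requested: string, mapping: Record<string, string>): string {
  return mapping[requested] ?? requested;
}
```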
**Settings**

- Edit global rate-limit and related admin settings (environment variables still take precedence).
- Includes the usage test interval configuration.
| Variable | Default | Description |
|---|---|---|
| `PORT` | `4141` | Server port |
| `VERBOSE` | `false` | Enable verbose logging (also accepts `DEBUG=true`) |
| `RATE_LIMIT` | - | Minimum seconds between requests |
| `RATE_LIMIT_WAIT` | `false` | Wait instead of error when the rate limit is hit |
| `SHOW_TOKEN` | `false` | Display tokens in logs |
| `PROXY_ENV` | `false` | Use `HTTP_PROXY`/`HTTPS_PROXY` from the environment |
```yaml
services:
  copilot-api:
    image: ghcr.io/qlhazycoder/copilot-api:latest
    container_name: copilot-api
    ports:
      - "4141:4141"
    volumes:
      - copilot-data:/data
    environment:
      - PORT=4141
      - VERBOSE=true
      - RATE_LIMIT=5
      - RATE_LIMIT_WAIT=true
    restart: unless-stopped

volumes:
  copilot-data:
```

If `RATE_LIMIT` / `RATE_LIMIT_WAIT` are not set via environment variables, you can configure them from the admin page's Settings tab. Environment variables take precedence over the saved web settings.
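The documented precedence (environment variable over saved web setting) amounts to a small resolver. A sketch, with the illustrative name `effectiveRateLimit`:

```typescript
// Resolve the effective rate limit: the RATE_LIMIT env var, when present,
// overrides the value saved from the admin Settings tab (rateLimitSeconds).
function effectiveRateLimit(
  env: { RATE_LIMIT?: string },
  saved: { rateLimitSeconds?: number },
): number | undefined {
  if (env.RATE_LIMIT !== undefined) return Number(env.RATE_LIMIT);
  return saved.rateLimitSeconds; // may be undefined: no rate limiting
}
```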
| Endpoint | Method | Description |
|---|---|---|
| `/v1/responses` | POST | OpenAI Responses API (available only for responses-capable models) |
| `/v1/chat/completions` | POST | Chat completions API (with capability-driven fallback) |
| `/v1/models` | GET | List available models |
| `/v1/embeddings` | POST | Create text embeddings |

Also available as compatibility aliases without the `/v1` prefix: `/chat/completions`, `/responses`, `/models`, `/embeddings`.
| Endpoint | Method | Description |
|---|---|---|
| `/v1/messages` | POST | Anthropic Messages API (with capability-driven fallback) |
| `/v1/messages/count_tokens` | POST | Token counting |
| Endpoint | Method | Description |
|---|---|---|
| `/v1beta/models/{model}:generateContent` | POST | Gemini-compatible non-stream ingress, internally fixed to `/chat/completions` |
| `/v1beta/models/{model}:streamGenerateContent` | POST | Gemini-compatible stream ingress, internally fixed to `/chat/completions` and returned over SSE |

Note: Gemini ingress is currently chat-only and text-input focused (`contents.parts.text`); if the model does not support chat, it returns 400 directly.
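The text-only Gemini-to-Chat translation can be sketched as a role and content mapping. This is a simplified illustration under the assumption that only `contents.parts.text` is carried over; the real translation layer handles more fields:

```typescript
interface GeminiContent {
  role?: string; // Gemini uses "user" and "model"
  parts: { text?: string }[];
}

// Convert Gemini `contents` into OpenAI-style chat messages: the "model"
// role becomes "assistant", and text parts are concatenated into `content`.
function geminiToChatMessages(contents: GeminiContent[]) {
  return contents.map((c) => ({
    role: c.role === "model" ? "assistant" : "user",
    content: c.parts.map((p) => p.text ?? "").join(""),
  }));
}
```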
| Endpoint | Method | Description |
|---|---|---|
| `/admin` | GET | Account management web UI (localhost only) |
| `/usage` | GET | Copilot usage statistics and quota |
| `/token` | GET | Current Copilot token |
This project does not implement a full Claude Code / Codex tool protocol compatibility layer. Tool support is currently best-effort and limited to the tool shapes that GitHub Copilot accepts reliably.
- Well-supported: standard `function` tools passed through OpenAI-compatible or Anthropic-compatible requests.
- Built-in Responses tools: Copilot/OpenAI-style built-in tools such as `web_search`, `web_search_preview`, `file_search`, `code_interpreter`, `image_generation`, and `local_shell` are supported when the upstream model/endpoint supports them.
- Special compatibility: the custom `apply_patch` tool is normalized into a `function` tool for better compatibility.
- Limited file-editing compatibility: common custom file-editing tool names such as `write`, `write_file`, `writefiles`, `edit`, `edit_file`, `multi_edit`, and `multiedit` are normalized into `function` tools so they are not dropped immediately by the proxy.
- Not guaranteed: skill-specific tools used by Claude Code, Codex, `superpowers`, or other agent frameworks may still fail if they depend on client-specific schemas, result formats, or tool-execution semantics that Copilot does not support upstream.
- Current limitation: this proxy does not yet provide a complete end-to-end compatibility layer for all Claude Code or Codex file tools. If a skill depends on a proprietary tool contract, additional adapter work is still required.
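The normalization described above can be sketched as a name-based rewrite to the `function` tool type. The helper name `normalizeTool` and the simplified tool shape are illustrative; the name list mirrors the one in this README:

```typescript
// Custom file-editing tool names that are rewritten to plain `function`
// tools so the upstream does not reject or drop them.
const FILE_EDIT_TOOLS = new Set([
  "write", "write_file", "writefiles",
  "edit", "edit_file", "multi_edit", "multiedit",
  "apply_patch",
]);

function normalizeTool(tool: { type: string; name?: string }) {
  if (tool.name && FILE_EDIT_TOOLS.has(tool.name.toLowerCase())) {
    return { ...tool, type: "function" };
  }
  return tool; // built-in and standard tools pass through untouched
}
```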
Configure Claude Code to use this proxy by creating a `.claude/settings.json` file:
```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:4141",
    "ANTHROPIC_AUTH_TOKEN": "sk-xxxx"
  },
  "model": "opus",
  "permissions": {
    "deny": ["WebSearch"]
  }
}
```

Model selection no longer needs to be hardcoded in `.claude/settings.json`. Open `/admin`, switch to the Model Mappings tab, and map Claude Code model aliases to the actual Copilot models you want to use. This is the recommended way to route `haiku`, `sonnet`, `opus`, dated Claude model IDs, or any other client-facing model name without changing local Claude Code settings each time.
More options: Claude Code settings
If you want Claude Code to inject an extra marker during the SubagentStart hook so copilot-api can more reliably distinguish initiator overrides, you can install the optional plugin directly from this repository:
```
/plugin marketplace add https://github.com/QLHazyCoder/copilot-api.git
/plugin install copilot-api-subagent-marker@copilot-api-marketplace
```

This plugin is only a lightweight hook helper. It does not start or manage the copilot-api service itself, which should still be deployed separately via Docker as described above.
The configuration file is stored at `/data/copilot-api/config.json` inside the container (persisted via the Docker volume).
```json
{
  "accounts": [
    {
      "id": "12345",
      "login": "github-user",
      "avatarUrl": "https://...",
      "token": "gho_xxxx",
      "accountType": "individual",
      "createdAt": "2025-01-27T..."
    }
  ],
  "activeAccountId": "12345",
  "extraPrompts": {
    "gpt-5-mini": "<exploration prompt>"
  },
  "smallModel": "gpt-5-mini",
  "modelReasoningEfforts": {
    "gpt-5-mini": "xhigh"
  }
}
```

| Key | Description |
|---|---|
| `accounts` | List of configured GitHub accounts |
| `activeAccountId` | Currently active account ID |
| `extraPrompts` | Per-model prompts appended to system messages |
| `smallModel` | Fallback model for warmup requests (default: `gpt-5-mini`) |
| `modelReasoningEfforts` | Per-model reasoning effort (`none`, `minimal`, `low`, `medium`, `high`, `xhigh`) |
| `modelMapping` | Alias mapping rules (persisted from the admin Model Mappings tab) |
| `premiumModelMultipliers` | Premium accounting multipliers per model |
| `modelCardMetadata` | Extended model-card metadata (e.g. context window / features) |
| `hiddenModels` | Models hidden in the admin UI |
| `useFunctionApplyPatch` | Whether `apply_patch` is normalized to a `function` tool (enabled by default) |
| `rateLimitSeconds` | Saved global minimum interval between requests, used when the `RATE_LIMIT` env var is not set |
| `rateLimitWait` | Saved wait behavior when the rate limit is hit, used when the `RATE_LIMIT_WAIT` env var is not set |
| `usageTestIntervalMinutes` | Usage test/poll interval in minutes (can be `null`) |
- Bun >= 1.2.x
- GitHub account with Copilot subscription
```bash
# Install dependencies
bun install

# Start development server (with hot reload)
bun run dev

# Type checking
bun run typecheck

# Linting
bun run lint
bun run lint --fix

# Run tests
bun test

# Production build
bun run build

# Check for unused code
bun run knip
```

- Rate Limiting: Use `RATE_LIMIT` to prevent hitting GitHub's rate limits. Set `RATE_LIMIT_WAIT=true` to queue requests instead of returning errors.
- Business/Enterprise Accounts: The account type is automatically detected during the OAuth flow.
- Multiple Accounts: Add multiple accounts via `/admin` and switch between them as needed.
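The wait-vs-error behavior of `RATE_LIMIT_WAIT` can be sketched as a pure decision function. This is an illustrative simplification (`checkRateLimit` is not the project's API; the real proxy tracks timing internally):

```typescript
// Decide what to do with an incoming request given the minimum interval:
// proceed if enough time has passed; otherwise wait (RATE_LIMIT_WAIT=true)
// or return an error (RATE_LIMIT_WAIT=false).
function checkRateLimit(
  lastRequestMs: number,
  nowMs: number,
  minIntervalMs: number,
  wait: boolean,
): { action: "proceed" | "wait" | "error"; delayMs: number } {
  const elapsed = nowMs - lastRequestMs;
  if (elapsed >= minIntervalMs) return { action: "proceed", delayMs: 0 };
  return wait
    ? { action: "wait", delayMs: minIntervalMs - elapsed }
    : { action: "error", delayMs: 0 };
}
```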
- Premium interaction counts come from Copilot/GitHub; this proxy does not invent its own billing model. The `/usage` endpoint simply exposes the upstream Copilot usage data.
- Skill, hook, plan, and subagent workflows may increase `premium_interactions`. When a client uses features such as Claude Code subagents or `superpowers`, Copilot may treat the parent interaction and the subagent interaction as separate billable interactions.
- Warmup requests may also count upstream. This project already tries to reduce the impact by routing some warmup-style requests to `smallModel`, but it cannot fully control how Copilot accounts for them.
- This is not fully fixable at the proxy layer. The proxy can normalize some message shapes to reduce accidental over-counting, but it cannot override Copilot's upstream interaction accounting.
- If you see an increase while using subagents, that does not necessarily mean the proxy sent duplicate requests. In the normal request path, the proxy forwards a single upstream request per chosen endpoint, but Copilot may still count multiple interactions for the overall workflow.
Please include the following in `CLAUDE.md` (for Claude usage):

- Do not ask the user questions directly; you MUST use the AskUserQuestion tool instead.
- Once you can confirm the task is complete, you MUST use the AskUserQuestion tool to have the user confirm. The user may respond with feedback if they are not satisfied with the result, which you can use to make improvements and try again.