Summary
Models should be able to declare which file MIME types they accept (e.g. application/pdf, image/*) so the frontend can adapt the upload UI and the backend can deliver files in the provider-native format.
Related issues
Current state
- multimodal: true + multimodalAcceptedMimetypes works well for images
- There's no equivalent for documents (PDF, DOCX, etc.)
- Binary files currently get base64-wrapped in XML tags, which most models can't process
Proposal
Add an acceptedFileMimetypes field to the model config:
{
"name": "gpt-4o",
"multimodal": true,
"acceptedFileMimetypes": ["image/*", "application/pdf"]
}
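As a rough sketch, the config entry above could be typed and sanity-checked as follows in TypeScript. The `ModelConfig` interface and the `invalidMimetypes` helper are hypothetical names used for illustration, not existing project code. Only the field names `name`, `multimodal`, `multimodalAcceptedMimetypes` and `acceptedFileMimetypes` come from this proposal.

```typescript
// Hypothetical shape of a model config entry; only the field names come from the proposal.
interface ModelConfig {
  name: string;
  multimodal?: boolean;
  multimodalAcceptedMimetypes?: string[];
  acceptedFileMimetypes?: string[];
}

// Matches "type/subtype" or "type/*", e.g. "application/pdf" or "image/*".
// Deliberately loose: it checks the general shape only, not IANA registration.
const MIME_PATTERN = /^[a-z0-9.+-]+\/(\*|[a-z0-9.+-]+)$/i;

// Returns the entries that do not look like MIME patterns (empty when the config is fine).
function invalidMimetypes(config: ModelConfig): string[] {
  return (config.acceptedFileMimetypes ?? []).filter((m) => !MIME_PATTERN.test(m));
}

const gpt4o: ModelConfig = {
  name: "gpt-4o",
  multimodal: true,
  acceptedFileMimetypes: ["image/*", "application/pdf"],
};

console.log(invalidMimetypes(gpt4o)); // no invalid entries for the example config
```

Checking the patterns at config load time would surface typos such as "pdf" early, before they silently hide an upload option.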
How it works:
- Each model declares which file MIME types it accepts
- The frontend merges acceptedFileMimetypes with multimodalAcceptedMimetypes to determine which upload options to show
- The endpoint adapter delivers files in the provider-native format (e.g., OpenAI's file content part for PDFs, image_url for images)
- For models/providers that don't natively handle a file type, the existing text extraction fallback still works
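The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: `acceptedMimetypes`, `mimeMatches`, `chooseDelivery` and `toOpenAIPart` are hypothetical helper names, and the `"native"` / `"extracted-text"` labels are invented for this example. The content-part shapes are modeled on OpenAI's Chat Completions API and should be checked against the current API reference before use.

```typescript
interface ModelConfig {
  name: string;
  multimodal?: boolean;
  multimodalAcceptedMimetypes?: string[];
  acceptedFileMimetypes?: string[];
}

// Frontend side: merge both fields, dropping duplicates while preserving order.
function acceptedMimetypes(model: ModelConfig): string[] {
  const merged = (model.multimodalAcceptedMimetypes ?? []).concat(model.acceptedFileMimetypes ?? []);
  return merged.filter((m, i) => merged.indexOf(m) === i);
}

// True when `mime` satisfies `pattern`; a "*" subtype matches any subtype.
function mimeMatches(pattern: string, mime: string): boolean {
  const [pType, pSub] = pattern.toLowerCase().split("/");
  const [mType, mSub] = mime.toLowerCase().split("/");
  return pType === mType && pSub !== undefined && (pSub === "*" || pSub === mSub);
}

type Delivery = "native" | "extracted-text";

// Backend side: deliver natively only what the model declares; otherwise fall back to text extraction.
function chooseDelivery(model: ModelConfig, fileMime: string): Delivery {
  const declared = acceptedMimetypes(model).some((p) => mimeMatches(p, fileMime));
  return declared ? "native" : "extracted-text";
}

// Assumed content-part shapes (verify against OpenAI's API reference).
type OpenAIPart =
  | { type: "image_url"; image_url: { url: string } }
  | { type: "file"; file: { filename: string; file_data: string } };

// Images go out as image_url parts, everything else as file parts, both as base64 data URLs.
function toOpenAIPart(filename: string, mime: string, base64: string): OpenAIPart {
  const dataUrl = "data:" + mime + ";base64," + base64;
  return mimeMatches("image/*", mime)
    ? { type: "image_url", image_url: { url: dataUrl } }
    : { type: "file", file: { filename: filename, file_data: dataUrl } };
}
```

On the frontend, `acceptedMimetypes(model).join(",")` would yield a string suitable for the `accept` attribute of a file input. Other providers would need their own equivalent of `toOpenAIPart`.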
Why this approach:
- Works for any provider — OpenAI, Anthropic, self-hosted via vLLM/Ollama, HF Inference API
- Models that natively support PDFs (GPT-4o, Claude, Gemini) get native handling
- Self-hosted models can still receive extracted text as fallback
- No heavy dependencies (no LibreOffice, no server-side PDF parsing required in core)
- Backward compatible, since supportsBinaryDocs: true can be mapped to acceptedFileMimetypes: [...]
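The backward-compatibility point could look something like the sketch below. `LegacyModelConfig` and `resolveAcceptedFileMimetypes` are hypothetical names, and the rule that an explicit `acceptedFileMimetypes` takes precedence over the legacy flag is an assumption of this example. Since the exact list that `supportsBinaryDocs: true` should map to is left open above, the caller supplies it as a parameter.

```typescript
// Hypothetical legacy-aware config; `supportsBinaryDocs` is the existing flag named in the proposal.
interface LegacyModelConfig {
  name: string;
  supportsBinaryDocs?: boolean;
  acceptedFileMimetypes?: string[];
}

// Resolves the effective MIME list for a model.
// `legacyDocMimetypes` is whatever list the maintainers decide the legacy flag should expand to.
function resolveAcceptedFileMimetypes(
  config: LegacyModelConfig,
  legacyDocMimetypes: string[],
): string[] {
  // Assumption: an explicitly configured list always wins over the legacy flag.
  if (config.acceptedFileMimetypes !== undefined) {
    return config.acceptedFileMimetypes;
  }
  // Legacy flag set: expand it to the agreed list (copied so callers cannot mutate the default).
  return config.supportsBinaryDocs === true ? legacyDocMimetypes.slice() : [];
}
```

Resolving the list in one place means the rest of the code only ever reads `acceptedFileMimetypes`, so the legacy flag could later be deprecated without touching the frontend or the adapters.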
Comparison with other projects:
- LibreChat has a multi-stage file processor pipeline with per-endpoint config
- Open WebUI has pluggable storage backends and document RAG workflows
- This proposal is lighter: trust the model/provider to handle what it declares it supports
I'm preparing PRs for this. Would love feedback from maintainers on the approach before finalizing.
🤖 Generated with Claude Code