[BUG] Kimi-K2.6 fails to load after upstream HuggingFace repo removed tokenization_kimi.py #2085

@atlasshrugged1000

Describe the bug
Kimi-K2.6 (moonshotai/Kimi-K2.6) fails to start with a download error or AttributeError after Moonshotai pushed commit 81bcaaa to the HuggingFace repo on 2026-05-11, which removed tokenization_kimi.py (the slow tokenizer).

Exo's load_tokenizer_for_model_id in utils_mlx.py explicitly loads this file and requires the TikTokenTokenizer class, whose .model attribute (a tiktoken.Encoding) is what _patched_encode calls. The fast tokenizer (TikTokenTokenizerFast) cannot substitute for it: the fast class uses a Rust backend, not tiktoken, and does not expose that attribute.

To Reproduce
1. Start exo on any node (source or .app)
2. Place a Kimi-K2.6 instance
3. Exo syncs metadata from HuggingFace and tries to download tokenization_kimi.py
4. The file returns a 404
5. The instance enters a start → fail → shutdown loop

Error messages
Download phase:
File not found: https://huggingface.co/moonshotai/Kimi-K2.6/resolve/main/tokenization_kimi.py

If the download is bypassed, the tokenizer load phase fails with:
AttributeError: module 'tokenization_kimi' has no attribute 'TikTokenTokenizer'
Or, if the fast tokenizer is substituted:
AttributeError: 'NoneType' object has no attribute 'encode'

The latter is raised at utils_mlx.py line 392 in _patched_encode, because hf_tokenizer.model is None on the fast tokenizer.

Root cause
Moonshot AI's commit 81bcaaa ("use-fast-tokenizer #38") removed tokenization_kimi.py from the K2.6 repo, but:
- The file still exists in moonshotai/Kimi-K2-Instruct (same tokenizer, same vocab)
- tokenizer_config.json still declares "auto_map": {"AutoTokenizer": [null, "tokenization_kimi_fast.TikTokenTokenizerFast"]}, leaving the slow-tokenizer slot null
- Exo's Kimi-specific code path requires the slow tokenizer's tiktoken.Encoding object
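The null slow-tokenizer slot can be detected directly from tokenizer_config.json before any download is attempted. A small sketch, using the auto_map structure quoted above (the helper name is made up for illustration):

```python
# Sketch: check whether a repo's tokenizer_config.json still advertises a
# slow tokenizer. HF's auto_map convention for AutoTokenizer is a two-element
# list: [slow_tokenizer_ref, fast_tokenizer_ref].
import json

def has_slow_tokenizer(config_text: str) -> bool:
    cfg = json.loads(config_text)
    entry = cfg.get("auto_map", {}).get("AutoTokenizer")
    return bool(entry) and entry[0] is not None

# The K2.6 config after commit 81bcaaa, as quoted in this issue:
sample = '{"auto_map": {"AutoTokenizer": [null, "tokenization_kimi_fast.TikTokenTokenizerFast"]}}'
print(has_slow_tokenizer(sample))  # False: slow slot is null
```

A check like this would let exo skip the tokenization_kimi.py download instead of hard-failing on the 404.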

Workaround
Copy the slow tokenizer from the K2-Instruct repo and lock it read-only:
```shell
MODEL_DIR="$HOME/.exo/models/moonshotai--Kimi-K2.6"

curl -sL "https://huggingface.co/moonshotai/Kimi-K2-Instruct/raw/main/tokenization_kimi.py" \
  -o "$MODEL_DIR/tokenization_kimi.py"

chmod 444 "$MODEL_DIR/tokenization_kimi.py"
chmod 444 "$MODEL_DIR/tokenizer_config.json"
```

The chmod 444 prevents exo from overwriting the files during metadata re-sync on restart.

Suggested fixes
- Don't hard-fail when tokenization_kimi.py returns a 404 during download; skip it gracefully
- Fall back to the fast tokenizer when the slow one is missing: adapt _patched_encode to work with TikTokenTokenizerFast, or use the fast tokenizer's native encode path
- Pin the model revision to avoid silent breakage from upstream HF repo changes
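The fallback fix could be as simple as preferring the tiktoken path only when it exists. A hedged sketch, with made-up stand-in classes rather than exo's real ones:

```python
# Sketch of the suggested fallback: use the slow tokenizer's tiktoken
# Encoding at .model when present, otherwise call the tokenizer's own
# encode() (the fast tokenizer's native Rust-backed path).

def encode_with_fallback(hf_tokenizer, text: str):
    encoding = getattr(hf_tokenizer, "model", None)
    if encoding is not None and hasattr(encoding, "encode"):
        return encoding.encode(text)  # slow path: tiktoken.Encoding
    return hf_tokenizer.encode(text)  # fast path: native encode()

class FakeSlow:
    """Stand-in slow tokenizer with a tiktoken-like .model."""
    class model:
        @staticmethod
        def encode(text):
            return [1, 2, 3]

class FakeFast:
    """Stand-in fast tokenizer: .model is None, native encode() works."""
    model = None
    def encode(self, text):
        return [4, 5, 6]

print(encode_with_fallback(FakeSlow(), "hi"))  # [1, 2, 3]
print(encode_with_fallback(FakeFast(), "hi"))  # [4, 5, 6]
```

One caveat: the two paths must produce identical token ids for the same vocab, which should hold here since the issue notes the fast tokenizer wraps the same vocabulary.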
Environment
- Exo: latest .app from exolabs.net + source at commit 87c72fc
- Hardware: 4× Mac Studio M3 Ultra, RDMA/Thunderbolt, tensor sharding
- macOS: 26.4.1
- Model: moonshotai/Kimi-K2.6 (595 GB, unquantized)
- Worked before: 2026-05-10 (prior to HF commit 81bcaaa)
