Commit f427399
Ralf Waldukat
fix: critical fixes for recurrent/hybrid model support
After external code review (GPT-5.2), this commit fixes three critical issues and makes several supporting improvements:
1. CRITICAL: Fixed the tokens[:-1] bug in prefix matching (see the first sketch below)
- The truncated comparison silently broke prefix matching for ALL models
- Caused false rewind detection and cache inefficiency
- Impact: transformer AND recurrent models
2. CRITICAL: Implemented a proper reset() for recurrent models (sketch below)
- reset() now actually clears the llama_memory backend state
- Root-cause fix for the 'sequence positions not consecutive' crash
- Without this, reset() was a no-op for recurrent models
3. CRITICAL: Enforced a strict append-only policy for recurrent models (sketch below)
- Prevents KV-cache rewinding, which is impossible without state snapshots
- Forces a full reset on history edits instead of crashing
4. Performance: Cached _is_recurrent to avoid repeated FFI calls (sketch below)
5. Documentation: Simplified comments and updated docstring
6. Testing: All existing tests pass + Mistral-Small-3.2-24B validated
Resolves multi-turn crashes for Nemotron-A3B, Mamba, RWKV, and Jamba models.
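
The sketch below illustrates the prefix-matching fix (item 1). It is a minimal, hypothetical reconstruction, not the code in this commit; function and variable names are illustrative. The point is that the comparison must run over the full incoming token list, since slicing off the last token (tokens[:-1]) makes a pure extension of the cache look like a divergence.

    def common_prefix_length(cached_tokens, new_tokens):
        # Count how many leading tokens the cached and incoming lists share.
        n = 0
        for cached, new in zip(cached_tokens, new_tokens):
            if cached != new:
                break
            n += 1
        return n

    # Buggy variant: comparing against new_tokens[:-1] drops the final token,
    # so appending a single token is misread as a history edit and triggers
    # a spurious rewind.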
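
A minimal sketch of the reset() fix (item 2), assuming the low-level llama.cpp memory API is exposed through the llama_cpp binding; the call names (llama_get_memory, llama_memory_clear) and attribute names are assumptions, not the actual implementation.

    import llama_cpp

    def reset(self):
        # Clear the backend memory (KV cache / recurrent state), not just the
        # Python-side bookkeeping; passing True also clears the data buffers.
        mem = llama_cpp.llama_get_memory(self.ctx)  # assumed binding names
        llama_cpp.llama_memory_clear(mem, True)
        # Reset Python-side tracking so the next request evaluates from zero.
        self.n_tokens = 0
        self.cached_tokens = []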
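
The strict append-only policy (item 3) can be sketched as follows, reusing common_prefix_length from the first sketch; again the surrounding names and structure are hypothetical. Recurrent state has no snapshot to roll back to, so any edit of the cached history triggers a full reset instead of a partial rewind.

    def prepare_for_tokens(self, new_tokens):
        shared = common_prefix_length(self.cached_tokens, new_tokens)
        if self._is_recurrent and shared < len(self.cached_tokens):
            # History was edited: recurrent state cannot be rewound, so drop
            # everything and re-evaluate the new prompt from scratch.
            self.reset()
            return 0
        # Transformer models (or a pure append) keep the shared prefix and
        # only evaluate the suffix.
        return shared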
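
Finally, a sketch of the _is_recurrent caching (item 4): the flag is read once via the FFI when the wrapper is constructed and reused afterwards. llama_model_is_recurrent exists in upstream llama.cpp; its availability under that name in the binding, and the constructor shape, are assumptions.

    import llama_cpp

    def __init__(self, model, ctx):
        self.ctx = ctx
        self.cached_tokens = []
        self.n_tokens = 0
        # One FFI call at construction time instead of one per request.
        self._is_recurrent = bool(llama_cpp.llama_model_is_recurrent(model))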
Reviewed-by: GPT-5.2 (OpenAI)
Tested-by: pytest + Mistral-Small-3.2-24B
Fixes: #2108 (recurrent model crashes)
Compatible-with: #2109 (Granite-Docling/SmolVLM special tokens)
1 file changed: 41 additions, 3 deletions