
Pull requests: ggerganov/llama.cpp

Add BPE pre-tokenization for DBRX.
#7132 opened May 7, 2024 by dranger003
Fix NFD computation
#7122 opened May 7, 2024 by JoanFM · Draft · 2 of 3 tasks
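
NFD is Unicode canonical decomposition (e.g. "é" becomes "e" plus a combining acute accent), which tokenizers need for normalization. A minimal sketch of the idea, with a tiny hardcoded table; a real implementation derives the mapping from UnicodeData.txt, decomposes recursively, and canonically reorders combining marks:

```cpp
#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <vector>

// Apply canonical decomposition to a sequence of codepoints.
std::vector<uint32_t> to_nfd(const std::vector<uint32_t> & cps) {
    // Tiny illustrative slice of the decomposition table.
    static const std::unordered_map<uint32_t, std::vector<uint32_t>> decomp = {
        {0x00E9, {0x0065, 0x0301}},  // é -> e + COMBINING ACUTE ACCENT
        {0x00F1, {0x006E, 0x0303}},  // ñ -> n + COMBINING TILDE
    };
    std::vector<uint32_t> out;
    for (uint32_t cp : cps) {
        auto it = decomp.find(cp);
        if (it != decomp.end()) {
            out.insert(out.end(), it->second.begin(), it->second.end());
        } else {
            out.push_back(cp);
        }
    }
    return out;
}

int main() {
    for (uint32_t cp : to_nfd({0x00E9})) {
        std::printf("U+%04X ", (unsigned) cp);  // prints U+0065 U+0301
    }
    std::printf("\n");
}
```
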
Add BPE pre-tokenization for Qwen2.
#7114 opened May 7, 2024 by jklj077
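
Both #7132 and #7114 register model-specific regex patterns that split text into pieces before BPE merging runs. A simplified sketch of that pre-tokenization step; the pattern here is an ASCII-only stand-in, since the real patterns use Unicode property classes that std::regex cannot express:

```cpp
#include <iostream>
#include <regex>
#include <string>
#include <vector>

// Split text into pre-tokenization pieces; BPE merges then run
// independently inside each piece.
std::vector<std::string> pre_tokenize(const std::string & text) {
    // Simplified stand-in: contractions, words, numbers, punctuation, spaces.
    static const std::regex pattern(
        "'s|'t|'re|'ve|'m|'ll|'d| ?[A-Za-z]+| ?[0-9]+| ?[^A-Za-z0-9 ]+| +");
    std::vector<std::string> pieces;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        pieces.push_back(it->str());
    }
    return pieces;
}

int main() {
    for (const auto & p : pre_tokenize("I'll split this, won't I?")) {
        std::cout << '[' << p << "] ";
    }
    std::cout << '\n';
}
```
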
main : add --conversation / -cnv flag
#7108 opened May 6, 2024 by dawidpotocki
tokenization: no double BOS tokens
#7107 opened May 6, 2024 by JohannesGaessler
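
#7107 targets the case where a prompt already begins with a BOS token and tokenization prepends a second one. A minimal sketch of the guard, using a hypothetical token type rather than llama.cpp's actual API:

```cpp
#include <cassert>
#include <vector>

using llama_token_t = int;  // hypothetical stand-in, not llama.cpp's type

// Prepend BOS only if the sequence does not already start with it.
std::vector<llama_token_t> with_single_bos(std::vector<llama_token_t> tokens,
                                           llama_token_t bos) {
    if (tokens.empty() || tokens.front() != bos) {
        tokens.insert(tokens.begin(), bos);
    }
    return tokens;
}

int main() {
    const llama_token_t bos = 1;
    auto once  = with_single_bos({5, 6}, bos);
    auto twice = with_single_bos(once, bos);  // idempotent: no second BOS
    assert(once == twice && once.front() == bos);
}
```
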
Vulkan Bugfixes and Improvements
#7084 opened May 5, 2024 by 0cc4m
convert-hf : save memory with lazy evaluation
Labels: enhancement, high priority, need feedback
#7075 opened May 4, 2024 by compilade · 7 tasks done
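
The idea in #7075 is that a converter need not materialize every transformed tensor up front; it can store a thunk per tensor and run it only at write time, keeping peak memory near the size of one tensor. A sketch of that pattern; types and names are illustrative (the actual convert-hf script is Python, not C++):

```cpp
#include <functional>
#include <iostream>
#include <vector>

using tensor = std::vector<float>;

struct lazy_tensor {
    std::function<tensor()> compute;  // deferred load + transform
};

int main() {
    std::vector<lazy_tensor> model;
    for (int i = 0; i < 3; ++i) {
        model.push_back({[i] {
            tensor t(4, float(i));       // stand-in for "load from disk"
            for (float & x : t) x *= 2;  // stand-in for a conversion step
            return t;
        }});
    }
    // Only one tensor is ever materialized at a time while writing.
    for (auto & lt : model) {
        tensor t = lt.compute();
        std::cout << "wrote tensor of size " << t.size() << '\n';
    }
}
```
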
CUDA: generalize FP16 fattn vec kernel
#7061 opened May 3, 2024 by JohannesGaessler
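
A common way to generalize a kernel like this over head sizes is to template the hot loop on the head dimension (so the compiler can fully unroll it) and pick the instantiation with a runtime switch. The plain-C++ sketch below shows only that dispatch pattern; it is not the PR's CUDA kernel:

```cpp
#include <cstdio>

// Head size is a compile-time constant, so the loop can be unrolled.
template <int head_size>
float dot_head(const float * q, const float * k) {
    float sum = 0.0f;
    for (int i = 0; i < head_size; ++i) {
        sum += q[i] * k[i];
    }
    return sum;
}

// Runtime dispatch to the supported instantiations.
float dot_head_dispatch(const float * q, const float * k, int head_size) {
    switch (head_size) {
        case  64: return dot_head< 64>(q, k);
        case 128: return dot_head<128>(q, k);
        default:  return 0.0f;  // unsupported head size
    }
}

int main() {
    float q[64] = {1.0f}, k[64] = {1.0f};
    std::printf("%f\n", dot_head_dispatch(q, k, 64));
}
```
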
add_special option for server tokenize endpoint
#7059 opened May 3, 2024 by JohanAR
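
A sketch of how a server tokenize handler could honor an "add_special" flag in the request body. The handler shape and the tokenizer stub are illustrative, not the actual server code; it assumes nlohmann::json is available (which the llama.cpp server uses):

```cpp
#include <iostream>
#include <string>
#include <vector>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

// Stand-in for the real tokenizer: one token per byte, plus an optional BOS.
std::vector<int> tokenize_stub(const std::string & text, bool add_special) {
    std::vector<int> toks(text.size(), 0);
    if (add_special) toks.insert(toks.begin(), 1 /* BOS id, stand-in */);
    return toks;
}

json handle_tokenize(const json & body) {
    const std::string content = body.value("content", "");
    const bool add_special    = body.value("add_special", false);
    return json{{"tokens", tokenize_stub(content, add_special)}};
}

int main() {
    const json req = {{"content", "hi"}, {"add_special", true}};
    std::cout << handle_tokenize(req).dump() << '\n';
}
```
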
Add token healing example
#7028 opened May 1, 2024 by mare5x · Draft
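
Token healing repairs the prompt/generation boundary: drop the final prompt token, then constrain the first sampled token to vocabulary entries whose text begins with the dropped suffix. A minimal sketch of the candidate filter, with a toy vocabulary:

```cpp
#include <iostream>
#include <string>
#include <vector>

int main() {
    const std::vector<std::string> vocab = {" worl", " world", " word", "ld"};
    const std::string dropped_suffix = " worl";  // text of the removed token

    // Candidate set for the first generated token after healing:
    // only tokens whose text starts with the dropped suffix are allowed.
    for (int id = 0; id < (int) vocab.size(); ++id) {
        if (vocab[id].rfind(dropped_suffix, 0) == 0) {  // starts_with
            std::cout << "candidate token " << id << ": '"
                      << vocab[id] << "'\n";
        }
    }
}
```
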
Fix flash attention for ROCm
#7011 opened Apr 30, 2024 by jdecourval · Draft
llama3 custom regex split
#6965 opened Apr 28, 2024 by jaime-m-p Loading…
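
The llama-3 pre-tokenizer regex relies on Unicode property classes that std::regex cannot express, so the split has to be implemented by hand over codepoint categories. A crude, ASCII-only sketch of category-driven splitting; real code would consult full Unicode tables and the model's actual pattern:

```cpp
#include <cctype>
#include <iostream>
#include <string>
#include <vector>

// Group a string into runs of letters, digits, and "other" — a crude
// stand-in for category-driven pre-tokenizer splitting.
std::vector<std::string> split_by_category(const std::string & s) {
    auto cat = [](unsigned char c) {
        if (std::isalpha(c)) return 0;
        if (std::isdigit(c)) return 1;
        return 2;
    };
    std::vector<std::string> out;
    for (size_t i = 0; i < s.size();) {
        size_t j = i + 1;
        while (j < s.size() && cat(s[j]) == cat(s[i])) j++;
        out.push_back(s.substr(i, j - i));
        i = j;
    }
    return out;
}

int main() {
    for (const auto & piece : split_by_category("abc123, def")) {
        std::cout << '[' << piece << "] ";  // prints [abc] [123] [, ] [def]
    }
    std::cout << '\n';
}
```
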
server: avoid breaking KV cache when prompt >= n_ctx
#6958 opened Apr 28, 2024 by prfd
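
One way to keep an over-long prompt from corrupting server state is to truncate it before evaluation: keep the first n_keep tokens, then the most recent tail, so the total stays below n_ctx with room left to generate. A minimal sketch; parameter names echo llama.cpp conventions but the function itself is illustrative, not the PR's fix:

```cpp
#include <cassert>
#include <vector>

// Truncate a prompt to fit in the context window: keep the n_keep-token
// prefix plus the most recent tail, leaving at least one slot free.
std::vector<int> fit_to_ctx(const std::vector<int> & prompt,
                            size_t n_ctx, size_t n_keep) {
    if (prompt.size() < n_ctx) return prompt;  // already fits
    const size_t n_tail = n_ctx - n_keep - 1;  // leave room to generate
    std::vector<int> out(prompt.begin(), prompt.begin() + n_keep);
    out.insert(out.end(), prompt.end() - n_tail, prompt.end());
    return out;
}

int main() {
    std::vector<int> prompt(100);
    auto fitted = fit_to_ctx(prompt, /*n_ctx=*/32, /*n_keep=*/4);
    assert(fitted.size() == 4 + 27);  // prefix + tail, below n_ctx
}
```
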