
Pull requests: ggerganov/llama.cpp

Add BPE pre-tokenization for DBRX.
#7132 opened May 7, 2024 by dranger003
Fix NFD computation
#7122 opened May 7, 2024 by JoanFM · Draft · 2 of 3 tasks
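
NFD is Unicode canonical decomposition (e.g. "é" becomes "e" plus a combining acute accent), which tokenizers need for normalization. A minimal sketch of the idea, with a tiny hardcoded table; a real implementation derives the mapping from UnicodeData.txt, decomposes recursively, and canonically reorders combining marks:

```cpp
#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <vector>

// Apply canonical decomposition to a sequence of codepoints.
std::vector<uint32_t> to_nfd(const std::vector<uint32_t> & cps) {
    // Tiny illustrative slice of the decomposition table.
    static const std::unordered_map<uint32_t, std::vector<uint32_t>> decomp = {
        {0x00E9, {0x0065, 0x0301}},  // é -> e + COMBINING ACUTE ACCENT
        {0x00F1, {0x006E, 0x0303}},  // ñ -> n + COMBINING TILDE
    };
    std::vector<uint32_t> out;
    for (uint32_t cp : cps) {
        auto it = decomp.find(cp);
        if (it != decomp.end()) {
            out.insert(out.end(), it->second.begin(), it->second.end());
        } else {
            out.push_back(cp);
        }
    }
    return out;
}

int main() {
    for (uint32_t cp : to_nfd({0x00E9})) {
        std::printf("U+%04X ", (unsigned) cp);  // prints U+0065 U+0301
    }
    std::printf("\n");
}
```
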
Add BPE pre-tokenization for Qwen2.
#7114 opened May 7, 2024 by jklj077
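
Both #7132 and #7114 register model-specific regex patterns that split text into pieces before BPE merging runs. A simplified sketch of that pre-tokenization step; the pattern here is an ASCII-only stand-in, since the real patterns use Unicode property classes that std::regex cannot express:

```cpp
#include <iostream>
#include <regex>
#include <string>
#include <vector>

// Split text into pre-tokenization pieces; BPE merges then run
// independently inside each piece.
std::vector<std::string> pre_tokenize(const std::string & text) {
    // Simplified stand-in: contractions, words, numbers, punctuation, spaces.
    static const std::regex pattern(
        "'s|'t|'re|'ve|'m|'ll|'d| ?[A-Za-z]+| ?[0-9]+| ?[^A-Za-z0-9 ]+| +");
    std::vector<std::string> pieces;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        pieces.push_back(it->str());
    }
    return pieces;
}

int main() {
    for (const auto & p : pre_tokenize("I'll split this, won't I?")) {
        std::cout << '[' << p << "] ";
    }
    std::cout << '\n';
}
```
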
main : add --conversation / -cnv flag
#7108 opened May 6, 2024 by dawidpotocki
tokenization: no double BOS tokens
#7107 opened May 6, 2024 by JohannesGaessler
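
#7107 targets the case where a prompt already begins with a BOS token and tokenization prepends a second one. A minimal sketch of the guard, using a hypothetical token type rather than llama.cpp's actual API:

```cpp
#include <cassert>
#include <vector>

using llama_token_t = int;  // hypothetical stand-in, not llama.cpp's type

// Prepend BOS only if the sequence does not already start with it.
std::vector<llama_token_t> with_single_bos(std::vector<llama_token_t> tokens,
                                           llama_token_t bos) {
    if (tokens.empty() || tokens.front() != bos) {
        tokens.insert(tokens.begin(), bos);
    }
    return tokens;
}

int main() {
    const llama_token_t bos = 1;
    auto once  = with_single_bos({5, 6}, bos);
    auto twice = with_single_bos(once, bos);  // idempotent: no second BOS
    assert(once == twice && once.front() == bos);
}
```
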
Vulkan Bugfixes and Improvements
#7084 opened May 5, 2024 by 0cc4m
convert-hf : save memory with lazy evaluation
Labels: enhancement, high priority, need feedback
#7075 opened May 4, 2024 by compilade · 7 tasks done
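
The idea in #7075 is that a converter need not materialize every transformed tensor up front; it can store a thunk per tensor and run it only at write time, keeping peak memory near the size of one tensor. A sketch of that pattern; types and names are illustrative (the actual convert-hf script is Python, not C++):

```cpp
#include <functional>
#include <iostream>
#include <vector>

using tensor = std::vector<float>;

struct lazy_tensor {
    std::function<tensor()> compute;  // deferred load + transform
};

int main() {
    std::vector<lazy_tensor> model;
    for (int i = 0; i < 3; ++i) {
        model.push_back({[i] {
            tensor t(4, float(i));       // stand-in for "load from disk"
            for (float & x : t) x *= 2;  // stand-in for a conversion step
            return t;
        }});
    }
    // Only one tensor is ever materialized at a time while writing.
    for (auto & lt : model) {
        tensor t = lt.compute();
        std::cout << "wrote tensor of size " << t.size() << '\n';
    }
}
```
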
CUDA: generalize FP16 fattn vec kernel
#7061 opened May 3, 2024 by JohannesGaessler
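
A common way to generalize a kernel like this over head sizes is to template the hot loop on the head dimension (so the compiler can fully unroll it) and pick the instantiation with a runtime switch. The plain-C++ sketch below shows only that dispatch pattern; it is not the PR's CUDA kernel:

```cpp
#include <cstdio>

// Head size is a compile-time constant, so the loop can be unrolled.
template <int head_size>
float dot_head(const float * q, const float * k) {
    float sum = 0.0f;
    for (int i = 0; i < head_size; ++i) {
        sum += q[i] * k[i];
    }
    return sum;
}

// Runtime dispatch to the supported instantiations.
float dot_head_dispatch(const float * q, const float * k, int head_size) {
    switch (head_size) {
        case  64: return dot_head< 64>(q, k);
        case 128: return dot_head<128>(q, k);
        default:  return 0.0f;  // unsupported head size
    }
}

int main() {
    float q[64] = {1.0f}, k[64] = {1.0f};
    std::printf("%f\n", dot_head_dispatch(q, k, 64));
}
```
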
add_special option for server tokenize endpoint
#7059 opened May 3, 2024 by JohanAR
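
A sketch of how a server tokenize handler could honor an "add_special" flag in the request body. The handler shape and the tokenizer stub are illustrative, not the actual server code; it assumes nlohmann::json is available (which the llama.cpp server uses):

```cpp
#include <iostream>
#include <string>
#include <vector>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

// Stand-in for the real tokenizer: one token per byte, plus an optional BOS.
std::vector<int> tokenize_stub(const std::string & text, bool add_special) {
    std::vector<int> toks(text.size(), 0);
    if (add_special) toks.insert(toks.begin(), 1 /* BOS id, stand-in */);
    return toks;
}

json handle_tokenize(const json & body) {
    const std::string content = body.value("content", "");
    const bool add_special    = body.value("add_special", false);
    return json{{"tokens", tokenize_stub(content, add_special)}};
}

int main() {
    const json req = {{"content", "hi"}, {"add_special", true}};
    std::cout << handle_tokenize(req).dump() << '\n';
}
```
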
Add token healing example
#7028 opened May 1, 2024 by mare5x · Draft
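
Token healing repairs the prompt/generation boundary: drop the final prompt token, then constrain the first sampled token to vocabulary entries whose text begins with the dropped suffix. A minimal sketch of the candidate filter, with a toy vocabulary:

```cpp
#include <iostream>
#include <string>
#include <vector>

int main() {
    const std::vector<std::string> vocab = {" worl", " world", " word", "ld"};
    const std::string dropped_suffix = " worl";  // text of the removed token

    // Candidate set for the first generated token after healing:
    // only tokens whose text starts with the dropped suffix are allowed.
    for (int id = 0; id < (int) vocab.size(); ++id) {
        if (vocab[id].rfind(dropped_suffix, 0) == 0) {  // starts_with
            std::cout << "candidate token " << id << ": '"
                      << vocab[id] << "'\n";
        }
    }
}
```
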
Fix flash attention for ROCm
#7011 opened Apr 30, 2024 by jdecourval · Draft
llama3 custom regex split
#6965 opened Apr 28, 2024 by jaime-m-p Loading…
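
The llama-3 pre-tokenizer regex relies on Unicode property classes that std::regex cannot express, so the split has to be implemented by hand over codepoint categories. A crude, ASCII-only sketch of category-driven splitting; real code would consult full Unicode tables and the model's actual pattern:

```cpp
#include <cctype>
#include <iostream>
#include <string>
#include <vector>

// Group a string into runs of letters, digits, and "other" — a crude
// stand-in for category-driven pre-tokenizer splitting.
std::vector<std::string> split_by_category(const std::string & s) {
    auto cat = [](unsigned char c) {
        if (std::isalpha(c)) return 0;
        if (std::isdigit(c)) return 1;
        return 2;
    };
    std::vector<std::string> out;
    for (size_t i = 0; i < s.size();) {
        size_t j = i + 1;
        while (j < s.size() && cat(s[j]) == cat(s[i])) j++;
        out.push_back(s.substr(i, j - i));
        i = j;
    }
    return out;
}

int main() {
    for (const auto & piece : split_by_category("abc123, def")) {
        std::cout << '[' << piece << "] ";  // prints [abc] [123] [, ] [def]
    }
    std::cout << '\n';
}
```
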
server: avoid breaking KV cache when prompt >= n_ctx
#6958 opened Apr 28, 2024 by prfd
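
One way to keep an over-long prompt from corrupting server state is to truncate it before evaluation: keep the first n_keep tokens, then the most recent tail, so the total stays below n_ctx with room left to generate. A minimal sketch; parameter names echo llama.cpp conventions but the function itself is illustrative, not the PR's fix:

```cpp
#include <cassert>
#include <vector>

// Truncate a prompt to fit in the context window: keep the n_keep-token
// prefix plus the most recent tail, leaving at least one slot free.
std::vector<int> fit_to_ctx(const std::vector<int> & prompt,
                            size_t n_ctx, size_t n_keep) {
    if (prompt.size() < n_ctx) return prompt;  // already fits
    const size_t n_tail = n_ctx - n_keep - 1;  // leave room to generate
    std::vector<int> out(prompt.begin(), prompt.begin() + n_keep);
    out.insert(out.end(), prompt.end() - n_tail, prompt.end());
    return out;
}

int main() {
    std::vector<int> prompt(100);
    auto fitted = fit_to_ctx(prompt, /*n_ctx=*/32, /*n_keep=*/4);
    assert(fitted.size() == 4 + 27);  // prefix + tail, below n_ctx
}
```
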