gguf-py: add support for I8, I16 and I32 #6045

certik · 2024-03-13T20:19:41Z

These types are documented at https://github.com/ggerganov/ggml/blob/9c2adc4962a3a5d259f10db2171e0df5c83e4b05/docs/gguf.md, and implemented in C in `enum ggml_type` (llama.cpp/ggml.h, line 340 in 19885d2). This PR adds support for them in the Python GGUF library.

The refactored dtype handling behaves the same as before, but it is now structured so that more NumPy dtypes can be added easily.

These types are allowed in the GGUF specification.
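For readers who want a concrete picture of "extensible dtype handling", here is a minimal sketch of a table-driven lookup. It is not the code from this PR: `TensorType`, `DTYPE_NAME_TO_TYPE` and `tensor_type_for` are hypothetical names, and the numeric enum values are assumptions that should be checked against `enum ggml_type` in ggml.h. The table is keyed on dtype names so the sketch runs without NumPy; with a NumPy array one would pass `arr.dtype.name`.

```python
from enum import IntEnum


class TensorType(IntEnum):
    """Hypothetical stand-in for gguf-py's GGMLQuantizationType.

    The numeric values are assumptions for illustration only;
    ggml.h holds the authoritative enum.
    """
    F32 = 0
    F16 = 1
    I8 = 24
    I16 = 25
    I32 = 26


# One table instead of an if/elif chain: supporting another dtype
# means adding a single entry here.
DTYPE_NAME_TO_TYPE = {
    "float32": TensorType.F32,
    "float16": TensorType.F16,
    "int8": TensorType.I8,
    "int16": TensorType.I16,
    "int32": TensorType.I32,
}


def tensor_type_for(dtype_name: str) -> TensorType:
    """Return the tensor type for a NumPy dtype name such as 'int16'."""
    try:
        return DTYPE_NAME_TO_TYPE[dtype_name]
    except KeyError:
        supported = ", ".join(sorted(DTYPE_NAME_TO_TYPE))
        raise ValueError(
            f"unsupported dtype {dtype_name!r}; supported: {supported}"
        ) from None
```

With this shape, an unsupported dtype fails loudly with the list of supported names rather than falling through to a default branch.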

ggerganov · 2024-03-14T09:44:36Z

Will merge after #6050

certik · 2024-03-14T15:38:36Z

@ggerganov my apologies for the tensor_shape vs tensor_dtype mistake --- I just discovered it this morning as well. I thought I tested it carefully, but I missed this. Thanks for fixing it!

ggerganov · 2024-03-14T15:57:44Z

No problem. Btw, I think we need to update the gguf-py version

certik · 2024-03-14T17:53:58Z

> Btw, I think we need to update the gguf-py version

I sent a PR to do so here: #6060.

* Refactor dtype handling to be extensible

  This code is equivalent as before, but now it is prepared to easily add more NumPy dtypes.

* Add support for I8, I16 and I32

  These types are allowed in the GGUF specification.

* Add support for I8, I16 and I32 to gguf_writer

* Add support for I8, I16, I32 to gguf_reader

Bring `GGMLQuantizationType` up to date; adds `I8`, `I16`, `I32`, `I64`, `F64`, `IQ1_M` and `BF16`.

Added in:
* ggerganov/llama.cpp#6045
* ggerganov/llama.cpp#6062
* ggerganov/llama.cpp#6302
* ggerganov/llama.cpp#6412

Refactor dtype handling to be extensible

b7e9d5c

The refactored dtype handling behaves the same as before, but it is now structured so that more NumPy dtypes can be added easily.

certik mentioned this pull request Mar 13, 2024

Binary format choice certik/mlc#34

Open

certik added 3 commits March 13, 2024 14:26

Add support for I8, I16 and I32

dc0e4d8

These types are allowed in the GGUF specification.

Add support for I8, I16 and I32 to gguf_writer

c542375

Add support for I8, I16, I32 to gguf_reader

fc5d6e6

certik force-pushed the gguf_writer branch from b4d2c4d to fc5d6e6 Compare March 13, 2024 20:26

certik mentioned this pull request Mar 14, 2024

Crash generating mnist-tests.gguf certik/mlc#36

Closed

ggerganov approved these changes Mar 14, 2024

ggerganov merged commit 3ca2348 into ggerganov:master Mar 14, 2024
21 checks passed

ggerganov added a commit that referenced this pull request Mar 14, 2024

gguf-py : fix dtype check (#6045)

77178ee

certik deleted the gguf_writer branch March 14, 2024 14:13

certik mentioned this pull request Mar 14, 2024

gguf-py : bump version to 0.8.0 #6060

Merged

NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp that referenced this pull request Mar 15, 2024

gguf-py : fix dtype check (ggerganov#6045)

42b03c4

hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024

gguf-py : fix dtype check (ggerganov#6045)

7a74e9a

CISC mentioned this pull request Jun 3, 2024

Update GGUF quantization types huggingface/huggingface.js#729

Merged
