
metal : pad n_ctx by 32 #6177

Merged
merged 2 commits into from Mar 22, 2024

Conversation

@ggerganov (Owner) commented Mar 20, 2024

Fixes #6173

We were padding kv_self.n but not n_ctx, leading to unaligned memory access with Metal.

@ggerganov ggerganov changed the title metal : require ne00 >= 128 for mat-mat kernels metal : pad n_ctx by 32 Mar 21, 2024
@ggerganov ggerganov merged commit 95d576b into master Mar 22, 2024
62 of 64 checks passed
@ggerganov ggerganov deleted the gg/metal-fix-mm branch March 22, 2024 07:36
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
* metal : require ne00 >= 128 for mat-mat kernels

ggml-ci

* llama : pad n_ctx by 32

ggml-ci
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 3, 2024
tybalex pushed a commit to tybalex/function.cpp that referenced this pull request Apr 17, 2024
Successfully merging this pull request may close these issues.

Regression: llama.cpp produces nonsensical outputs when using batched decoding on Metal