
[BUG] Internal error when fine-tuning Gemma #1363

Open
awni opened this issue Aug 26, 2024 · 0 comments
awni commented Aug 26, 2024

E.g.:

mlx_lm.lora --model mlx-community/codegemma-7b-it-8bit --train --adapter-path adapters_codegemma_7B --data training_data --iters 500

Can result in:

libc++abi: terminating due to uncaught exception of type std::runtime_error: [METAL] Command buffer execution failed: Internal Error (0000000e:Internal Error)
zsh: abort      mlx_lm.lora --model mlx-community/codegemma-7b-it-8bit --train --adapter-path

The split matmul on the output projection, which has a very large inner dimension (256k), appears to be the culprit. @jagrit06 is looking into this.
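For context, a split (split-K) matmul partitions the shared inner dimension into chunks, computes a partial product per chunk, and then accumulates the partials. The sketch below is a plain NumPy illustration of the general technique, not MLX's actual Metal kernel; the function name and chunking strategy are illustrative only.

```python
import numpy as np

def split_k_matmul(a, b, num_splits):
    # Partition the shared inner (K) dimension into `num_splits` chunks,
    # compute one partial product per chunk, then sum the partials.
    k = a.shape[1]
    partials = []
    for idx in np.array_split(np.arange(k), num_splits):
        partials.append(a[:, idx] @ b[idx, :])
    return np.sum(partials, axis=0)

a = np.random.randn(4, 1024)
b = np.random.randn(1024, 8)
# The accumulated result matches a single full matmul (up to float rounding).
assert np.allclose(split_k_matmul(a, b, 8), a @ b)
```

With a very large K (here, on the order of 256k for the output projection), each GPU command in the split can still be sizable, which is where the Metal command buffer failure seems to surface.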

@awni awni added the bug Something isn't working label Aug 26, 2024