
Can't use large models with pipeline() #1179

Open · sroussey opened this issue Feb 1, 2025 · 6 comments

Labels: bug (Something isn't working)

Comments

sroussey (Contributor) commented Feb 1, 2025

System Info

Example:

        const p = await pipeline('text-generation', 'Xenova/Phi-3-mini-4k-instruct', {
            device: 'webgpu',
            dtype: 'q4',
        });

I see this error in the console:

Uncaught (in promise) Error: Can't create a session. ERROR_CODE: 1, ERROR_MESSAGE: Deserialize tensor model.layers.5.mlp.gate_proj.MatMul.weight_Q4 failed. Failed to load external data file "model_q4.onnx_data", error: Module.MountedFiles is not available.

Seeing that the external .onnx_data file is the issue, I figured I needed to pass use_external_data_format along, but it does not work.

I have tried:

        const p = await pipeline('text-generation', 'Xenova/Phi-3-mini-4k-instruct', {
            device: 'webgpu',
            dtype: 'q4',
            use_external_data_format: true,
        });

and

        const p = await pipeline('text-generation', 'Xenova/Phi-3-mini-4k-instruct', {
            device: 'webgpu',
            dtype: 'q4',
            session_options: {use_external_data_format: true},
        });

But neither of these will load the model correctly.

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

See above.

Reproduction

See above.

@sroussey sroussey added the bug Something isn't working label Feb 1, 2025
sroussey (Contributor, Author) commented Feb 1, 2025

I altered the source to pass use_external_data_format through, and I see that the extra data file loads. However, it seems more difficult to use this way... like how do you use a chat template?

xenova (Collaborator) commented Feb 1, 2025

Thanks for #1180! I don't know why this wasn't added before 👀

> However, it seems more difficult to use this way... like how do you use a chat template?

You can just pass in the messages object (and it will do the templating for you). See here for example code.

If you want access to the tokenizer to do things yourself, you can access pipeline.tokenizer.apply_chat_template
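A minimal sketch of what this looks like, assuming the standard role/content message shape; the pipeline and generation calls are shown as comments because they require @huggingface/transformers and a model download:

```javascript
// Chat messages in the role/content shape that chat templates expect.
const messages = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Write a haiku about WebGPU.' },
];

// Passing the messages array directly lets the pipeline apply the
// model's chat template for you (requires @huggingface/transformers):
//
//   const generator = await pipeline('text-generation',
//     'Xenova/Phi-3-mini-4k-instruct', { device: 'webgpu', dtype: 'q4' });
//   const output = await generator(messages, { max_new_tokens: 64 });
//
// Or, to do the templating yourself via the tokenizer:
//
//   const prompt = generator.tokenizer.apply_chat_template(messages, {
//     tokenize: false, add_generation_prompt: true });

console.log(messages.map((m) => m.role).join(','));
```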

sroussey (Contributor, Author) commented Feb 1, 2025

BTW: if you run the code above without the fix, it loads most of the files, but not all, and so it fails.

Problem number 2: if you then fix the transformers code, it will still fail, since not all of the files were downloaded correctly, but the cache thinks they were.

I need to open DevTools, go to the Application tab, then Cache Storage, and delete transformers-cache by right-clicking and choosing Delete. Only then will the code work.

This seems brittle.
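For what it's worth, the cache can also be dropped programmatically instead of through DevTools. A hedged sketch using the browser Cache Storage API; 'transformers-cache' is the cache name mentioned above, and the feature-detect keeps this from throwing outside a browser:

```javascript
// Delete the transformers.js model cache so a broken partial download
// gets re-fetched on the next load. caches.delete() resolves to true
// if the named cache existed and was removed, false otherwise.
async function clearTransformersCache(cacheName = 'transformers-cache') {
  if (typeof caches === 'undefined') {
    // Not in a browser (no Cache Storage support): nothing to clear.
    return false;
  }
  return caches.delete(cacheName);
}
```

Calling this before retrying the pipeline() load would force a clean re-download instead of reusing the corrupted cache entry.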

benc-uk commented Feb 16, 2025

Exact same issue in #963

tangkunyin commented

@benc-uk @sroussey Other option parameters were being dropped as well, not only use_external_data_format; see #1200.

Hope it will be fixed as soon as possible. Thanks a lot @xenova

tangkunyin commented

Until the official update lands, anyone can use this as a temporary workaround:

"@huggingface/transformers": "git+https://github.com/tangkunyin/transformers.js.git#develop"

It works for me!
