RuntimeError: The serialized model is larger than the 2GiB limit imposed by the protobuf library.

### System Info

GPU: NVIDIA A6000
python 3.10

### Environment/Platform

- [ ] Website/web-app
- [ ] Browser extension
- [x] Server-side (e.g., Node.js, Deno, Bun)
- [ ] Desktop app (e.g., Electron)
- [ ] Other (e.g., VSCode extension)

### Description

I can successfully export the Llama2 model from Hugging Face to ONNX, but exporting the DeepSeek model fails, even though both are 7B models of similar size. I use the same command for both exports, changing only the model and output paths. Could you help me understand why this happens?

Here are my commands:
Llama2 (worked):
python -m scripts.convert   --model_id /mnt/data/ehdd1/home/models/hf/Llama-2-7b-chat-hf/Llama-2-7b-chat-hf/   --output_parent_dir /mnt/data/ehdd1/home/models/onnx/Llama-2-7b-chat-hf/fp16/   --skip_onnxslim   --quantize False   --task text-generation

DeepSeek (failed):
python -m scripts.convert   --model_id /mnt/data/ehdd1/home/models/hf/deepseek-llm-7b-chat/deepseek-llm-7b-chat/   --output_parent_dir /mnt/data/ehdd1/home/models/onnx/deepseek-llm-7b-chat/fp16/   --skip_onnxslim   --quantize False   --task text-generation


Error:
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/myid/kz96891/Code/non-determinism/framework/transformers.js/scripts/convert.py", line 455, in <module>
    main()
  File "/home/myid/kz96891/Code/non-determinism/framework/transformers.js/scripts/convert.py", line 342, in main
    main_export(**export_kwargs)
  File "/home/myid/kz96891/anaconda3/envs/python3.10/lib/python3.12/site-packages/optimum/exporters/onnx/__main__.py", line 373, in main_export
    onnx_export_from_model(
  File "/home/myid/kz96891/anaconda3/envs/python3.10/lib/python3.12/site-packages/optimum/exporters/onnx/convert.py", line 1199, in onnx_export_from_model
    _, onnx_outputs = export_models(
                      ^^^^^^^^^^^^^^
  File "/home/myid/kz96891/anaconda3/envs/python3.10/lib/python3.12/site-packages/optimum/exporters/onnx/convert.py", line 786, in export_models
    export(
  File "/home/myid/kz96891/anaconda3/envs/python3.10/lib/python3.12/site-packages/optimum/exporters/onnx/convert.py", line 892, in export
    export_output = export_pytorch(
                    ^^^^^^^^^^^^^^^
  File "/home/myid/kz96891/anaconda3/envs/python3.10/lib/python3.12/site-packages/optimum/exporters/onnx/convert.py", line 585, in export_pytorch
    onnx_export(
  File "/home/myid/kz96891/anaconda3/envs/python3.10/lib/python3.12/site-packages/torch/onnx/__init__.py", line 375, in export
    export(
  File "/home/myid/kz96891/anaconda3/envs/python3.10/lib/python3.12/site-packages/torch/onnx/utils.py", line 503, in export
    _export(
  File "/home/myid/kz96891/anaconda3/envs/python3.10/lib/python3.12/site-packages/torch/onnx/utils.py", line 1565, in _export
    graph, params_dict, torch_out = _model_to_graph(
                                    ^^^^^^^^^^^^^^^^
  File "/home/myid/kz96891/anaconda3/envs/python3.10/lib/python3.12/site-packages/torch/onnx/utils.py", line 1118, in _model_to_graph
    graph = _optimize_graph(
            ^^^^^^^^^^^^^^^^
  File "/home/myid/kz96891/anaconda3/envs/python3.10/lib/python3.12/site-packages/torch/onnx/utils.py", line 664, in _optimize_graph
    _C._jit_pass_onnx_graph_shape_type_inference(
RuntimeError: The serialized model is larger than the 2GiB limit imposed by the protobuf library. Therefore the output file must be a file path, so that the ONNX external data can be written to the same directory. Please specify the output file name.
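For context, here is a rough sketch of the arithmetic behind the 2GiB limit named in the error message. The parameter count (about 7e9) and the bytes per weight (2 for fp16, 4 for fp32) are assumptions for illustration, not values measured from either checkpoint, and the helper names are hypothetical.

```python
# 2 GiB: the protobuf serialization limit named in the error message.
PROTOBUF_LIMIT_BYTES = 2 * 1024**3


def weights_size_bytes(num_params: int, bytes_per_param: int) -> int:
    """Raw size of the weights alone, ignoring graph structure and metadata."""
    return num_params * bytes_per_param


def exceeds_protobuf_limit(num_params: int, bytes_per_param: int) -> bool:
    """True if the weights alone would not fit in a single protobuf message."""
    return weights_size_bytes(num_params, bytes_per_param) > PROTOBUF_LIMIT_BYTES


if __name__ == "__main__":
    # Assumption: roughly 7e9 parameters; 2 bytes each for fp16, 4 for fp32.
    approx_params = 7_000_000_000
    for label, width in (("fp16", 2), ("fp32", 4)):
        size_gib = weights_size_bytes(approx_params, width) / 1024**3
        over = exceeds_protobuf_limit(approx_params, width)
        print(f"{label}: {size_gib:.1f} GiB, exceeds limit: {over}")
```

By this estimate both 7B checkpoints are far above 2GiB at either precision, so total size alone would not seem to explain why only one of the two exports fails.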


Is there any way to solve this? I don't know which factor causes the difference between the two models.
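One way to narrow down the differing factor might be to diff the two checkpoints' `config.json` files. The sketch below assumes each model directory contains a standard Hugging Face `config.json`; `load_config` and `diff_configs` are hypothetical helper names, not part of `scripts.convert` or Optimum.

```python
import json
import sys
from pathlib import Path


def load_config(model_dir: str) -> dict:
    """Read config.json from a local model directory (assumed to exist)."""
    return json.loads((Path(model_dir) / "config.json").read_text())


def diff_configs(a: dict, b: dict) -> dict:
    """Map each key whose value differs to a (value_in_a, value_in_b) pair.

    A key missing from one side is reported with None on that side.
    """
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python diff_configs.py <llama2_model_dir> <deepseek_model_dir>
    differences = diff_configs(load_config(sys.argv[1]), load_config(sys.argv[2]))
    for key, (left, right) in differences.items():
        print(f"{key}: {left!r} != {right!r}")
```

Any fields that differ between the two models (for example, ones that affect the size of individual tensors) would be candidates to report alongside the traceback.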

### Reproduction

python -m scripts.convert   --model_id /mnt/data/ehdd1/home/models/hf/deepseek-llm-7b-chat/deepseek-llm-7b-chat/   --output_parent_dir /mnt/data/ehdd1/home/models/onnx/deepseek-llm-7b-chat/fp16/   --skip_onnxslim   --quantize False   --task text-generation
