
ONNXRuntimeError with a model that passes onnx.checker and was supported on previous ort versions #20641

Open
cestpasphoto opened this issue May 10, 2024 · 2 comments
Labels
core runtime (issues related to core runtime)

Comments


cestpasphoto commented May 10, 2024

Describe the issue

I have a model exported by PyTorch that was supported up to ort 1.16.3 but now fails with ort 1.17.0:
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (MatMulBnFusion_Gemm) Op (Gemm) [ShapeInferenceError] First input does not have rank 2

I have tried several versions of PyTorch and several configurations of torch.onnx.export, but it always fails. As soon as I downgrade to 1.16.3 it works, whatever the PyTorch version, which makes me think ort is the culprit.
I have also run onnx.checker.check_model(mymodel, full_check=True), and it raised no issue.
It happens even in a pure CPU environment.

The following code passes with ort 1.16.3 but raises an error with 1.17.0.

To reproduce

import onnx
import onnxruntime as ort

file = 'example_onnx_file.onnx'
mymodel = onnx.load(file)
onnx.checker.check_model(mymodel, full_check=True)  # Passes: no error

ort.InferenceSession(file, providers=['CPUExecutionProvider'])  # Raises the ShapeInferenceError on 1.17.0
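A possible way to isolate this (a hedged sketch, not verified against this model): the failing node name MatMulBnFusion_Gemm suggests the Gemm was created by onnxruntime's MatMul+BatchNormalization fusion pass, so loading the model with graph optimizations disabled should show whether that pass produces the invalid node:

import onnxruntime as ort

so = ort.SessionOptions()
# Disable all graph optimizations so no fusion pass can rewrite the graph;
# ORT_ENABLE_BASIC could also be tried, to bisect which optimization level breaks.
so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL

sess = ort.InferenceSession('example_onnx_file.onnx', sess_options=so,
                            providers=['CPUExecutionProvider'])

If the session loads this way, that points at the fusion pass rather than the model itself.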

Urgency

No response

Platform

Linux

OS Version

Debian stable

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.17.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

github-actions bot added the ep:CUDA (issues related to the CUDA execution provider) label on May 10, 2024
@cestpasphoto (Author)

Please use the following file: https://github.com/cestpasphoto/alpha-zero-general/blob/master/splendor/example_onnx_file.onnx

To answer the auto-label: CUDA is NOT involved here; it happens in a CPU-only environment.

sophies927 removed the ep:CUDA (issues related to the CUDA execution provider) label on May 16, 2024
edgchen1 added the core runtime (issues related to core runtime) label on May 16, 2024

csukuangfj commented May 23, 2024

I get a similar error after switching from onnxruntime 1.17.1 to 1.18.0.

The error log can be found at
https://github.com/csukuangfj/sherpa-onnx/actions/runs/9204107994/job/25316940778#step:15:53

terminate called after throwing an instance of 'Ort::Exception'
  what():  Node (Loop_5471) Op (Loop) [TypeInferenceError] Graph attribute inferencing failed: Node (Concat_5490) Op (Concat) [ShapeInferenceError] All inputs to Concat must have same rank. Input 1 has rank 2 != 1

(Note that the only change is the newer version of onnxruntime; everything else is kept the same.)
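A hedged diagnostic sketch (an addition, not from either report): running ONNX's own strict shape inference over a failing model checks whether the model is internally consistent; if it passes while ort.InferenceSession fails, the divergence lies in onnxruntime's graph/type inferencing or in one of its graph transformers:

import onnx
from onnx import shape_inference

model = onnx.load('example_onnx_file.onnx')  # path from the first report; any failing model works
# strict_mode=True turns shape-inference inconsistencies into errors
inferred = shape_inference.infer_shapes(model, strict_mode=True)
onnx.checker.check_model(inferred, full_check=True)
print('ONNX shape inference is self-consistent')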
