
ONNXRuntimeError with a model that passes onnx.checker and was supported on previous ort versions #20641

Open
cestpasphoto opened this issue May 10, 2024 · 2 comments
Labels
core runtime (issues related to core runtime)

Comments


cestpasphoto commented May 10, 2024

Describe the issue

I have a model exported by PyTorch that was supported up to ort 1.16.3 but now fails with ort 1.17.0:
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (MatMulBnFusion_Gemm) Op (Gemm) [ShapeInferenceError] First input does not have rank 2

I have tried several versions of PyTorch and several configurations of torch.onnx.export, but it always fails. As soon as I downgrade to 1.16.3 it works, whatever the PyTorch version, which makes me think ort is the culprit.
I have also run onnx.checker.check_model(mymodel, full_check=True), and it raised no issue.
It happens even in a pure CPU environment.

The following code passes with ort 1.16.3 but raises an error with 1.17.0.

To reproduce

import onnx
import onnxruntime as ort

file = 'example_onnx_file.onnx'
mymodel = onnx.load(file)
onnx.checker.check_model(mymodel, full_check=True)  # Passes: no error

ort.InferenceSession(file, providers=['CPUExecutionProvider'])  # Raises the ShapeInferenceError on 1.17.0
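A possible way to isolate this (a hedged sketch, not verified against this model): the failing node name MatMulBnFusion_Gemm suggests the Gemm was created by onnxruntime's MatMul+BatchNormalization fusion pass, so loading the model with graph optimizations disabled should show whether that pass produces the invalid node:

import onnxruntime as ort

so = ort.SessionOptions()
# Disable all graph optimizations so no fusion pass can rewrite the graph;
# ORT_ENABLE_BASIC could also be tried, to bisect which optimization level breaks.
so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL

sess = ort.InferenceSession('example_onnx_file.onnx', sess_options=so,
                            providers=['CPUExecutionProvider'])

If the session loads this way, that points at the fusion pass rather than the model itself.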

Urgency

No response

Platform

Linux

OS Version

Debian stable

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.17.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

github-actions bot added the ep:CUDA (issues related to the CUDA execution provider) label on May 10, 2024
@cestpasphoto (Author)

Please use the following file: https://github.com/cestpasphoto/alpha-zero-general/blob/master/splendor/example_onnx_file.onnx

To answer the auto-label: CUDA is NOT involved here; it happens in a CPU-only environment.

sophies927 removed the ep:CUDA (issues related to the CUDA execution provider) label on May 16, 2024
edgchen1 added the core runtime (issues related to core runtime) label on May 16, 2024

csukuangfj commented May 23, 2024

I get a similar error after switching from onnxruntime 1.17.1 to 1.18.0.

The error log can be found at
https://github.com/csukuangfj/sherpa-onnx/actions/runs/9204107994/job/25316940778#step:15:53

terminate called after throwing an instance of 'Ort::Exception'
  what():  Node (Loop_5471) Op (Loop) [TypeInferenceError] Graph attribute inferencing failed: Node (Concat_5490) Op (Concat) [ShapeInferenceError] All inputs to Concat must have same rank. Input 1 has rank 2 != 1

(Note that the only change is the newer version of onnxruntime; everything else is kept the same.)
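A hedged diagnostic sketch (an addition, not from either report): running ONNX's own strict shape inference over a failing model checks whether the model is internally consistent; if it passes while ort.InferenceSession fails, the divergence lies in onnxruntime's graph/type inferencing or in one of its graph transformers:

import onnx
from onnx import shape_inference

model = onnx.load('example_onnx_file.onnx')  # path from the first report; any failing model works
# strict_mode=True turns shape-inference inconsistencies into errors
inferred = shape_inference.infer_shapes(model, strict_mode=True)
onnx.checker.check_model(inferred, full_check=True)
print('ONNX shape inference is self-consistent')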
