DequantizeLinear spec clarification: What happens if the subtraction overflows/underflows? #6132

Open
TinaAMD opened this issue May 7, 2024 · 1 comment
Labels
question Questions about ONNX spec clarification Clarification of the ONNX spec needed

Comments

@TinaAMD

TinaAMD commented May 7, 2024

Ask a Question

Question

The DequantizeLinear spec defines the dequantization formula as y = (x - x_zero_point) * x_scale and requires that x and x_zero_point have the same dtype.
What is the intended behavior if x - x_zero_point falls outside the range representable in that dtype?

Further information

I noticed that the reference implementation converts to float32 before doing the subtraction, while ONNX Runtime casts to int32_t.
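To make the question concrete, here is a minimal pure-Python sketch of the two readings for int8 inputs. The helper names (`wrap_int`, `dequantize_same_dtype`, `dequantize_widened`) are hypothetical and not taken from ONNX or ONNX Runtime. Two's-complement wraparound is only an assumed example of what subtracting in the input dtype could do; saturation would be another possibility. Python integers are arbitrary precision, so the widened path simply models a subtraction that cannot overflow.

```python
def wrap_int(value: int, bits: int, signed: bool) -> int:
    """Reduce value modulo 2**bits, mimicking two's-complement wraparound."""
    value &= (1 << bits) - 1
    if signed and value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def dequantize_same_dtype(x, zero_point, scale, bits=8, signed=True):
    """Assumed reading A: subtract in the input dtype, wrapping on overflow."""
    return wrap_int(x - zero_point, bits, signed) * scale


def dequantize_widened(x, zero_point, scale):
    """Reading B: subtract in a wider type, so the difference is exact."""
    return (x - zero_point) * scale


# int8 extremes: the true difference is -255, which does not fit in int8.
x, zero_point, scale = -128, 127, 0.5
narrow = dequantize_same_dtype(x, zero_point, scale)
wide = dequantize_widened(x, zero_point, scale)
print(narrow, wide)
```

For 8-bit inputs, widening to int32 and widening to float32 both hold the difference exactly, so those two approaches should agree here; the wrapped result differs in both sign and magnitude. Whether they still agree for wider input types is part of what the spec would need to state.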

@TinaAMD TinaAMD added the question Questions about ONNX label May 7, 2024
@justinchuby justinchuby added the spec clarification Clarification of the ONNX spec needed label May 7, 2024
@justinchuby
Contributor

@onnx/sig-operators
