bf16 kernel (OpSet13) for MatMul in CPU EP #20630
Labels: core runtime (issues related to core runtime), feature request (request for unsupported feature or enhancement)
Comments
github-actions bot added the platform:windows label on May 9, 2024

snnn added the feature request and core runtime labels and removed the platform:windows label on May 9, 2024
The reason behind the slow adoption of bf16 on CPU: for training, most models are trained on GPUs; for inference, int8 and int4 quantization have better support (no specialized hardware needed). If you want to implement it, I think you can use a hardware-specific library. For Intel CPUs, you can start from the following:
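Independent of which hardware-specific library is chosen, a bf16 MatMul kernel typically rounds the fp32 inputs to bf16 and accumulates in fp32. A minimal sketch of those semantics, emulated in NumPy (not onnxruntime code; truncation is used here instead of the round-to-nearest-even a production kernel would do):

```python
import numpy as np

def to_bf16(x):
    """Emulate bfloat16 by truncating fp32 mantissas to 7 bits.

    bf16 keeps fp32's 8 exponent bits and the top 7 mantissa bits,
    so zeroing the low 16 bits of each fp32 word yields a bf16-
    representable value (round-toward-zero; real kernels usually
    round to nearest even).
    """
    u = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (u & np.uint32(0xFFFF0000)).view(np.float32)

def matmul_bf16(a, b):
    """bf16 MatMul emulation: operands rounded to bf16, fp32 accumulation."""
    return np.matmul(to_bf16(a), to_bf16(b)).astype(np.float32)

a = np.random.rand(4, 8).astype(np.float32)
b = np.random.rand(8, 3).astype(np.float32)
c = matmul_bf16(a, b)
# With only 7 mantissa bits on the inputs, the result still stays
# close to the full fp32 matmul for well-scaled data.
err = np.abs(c - a @ b).max()
```

A native kernel would skip the fp32 round-trip and call into a bf16 GEMM (e.g. via a vendor library), but the numerics above are the contract such a kernel is expected to satisfy.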
Describe the issue
MatMul in ONNX OpSet 13 started to support bf16 (https://onnx.ai/onnx/operators/onnx__MatMul.html).
However, we don't see an implementation for bfloat16 in the CPU EP for MatMul(13): https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/cpu/math/matmul.cc#L61-L89
Is there any reason this is still not supported, since the OpSet was released a long time ago?
If we want to implement it on our own, is there any PR I can reference?
Ping @snnn @pranavsharma for help.
To reproduce
NA
Urgency
No response
Platform
Windows
OS Version
11
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
NA
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response