Is there a plan to support fp8 inference? #19671
Comments
@james77777778 any thoughts on this?
If the model is trained with fp8, it is ready for inference. We can fix the scaling factor and drop the

If the model is not trained with fp8 and we don't plan to train it in the future, we need a mechanism to calibrate it. Calibration is similar to fp8 training, but we only need to compute the scaling factor offline with an additional calibration dataset. I'm unsure whether we should add the calibration logic to Keras.
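To make the calibration idea above concrete, here is a minimal NumPy sketch, not Keras code. It assumes a per-tensor scale computed as `amax / fp8_max` with the E4M3 maximum of 448. The names `calibrate_scale` and `fake_quantize` are hypothetical, and the quantization is simulated with clipping only (no real fp8 rounding).

```python
import numpy as np

E4M3_MAX = 448.0  # largest finite value representable in float8 E4M3


def calibrate_scale(calibration_batches, fp8_max=E4M3_MAX):
    """Return a fixed per-tensor scale from the largest absolute value seen."""
    amax = 0.0
    for batch in calibration_batches:
        amax = max(amax, float(np.max(np.abs(batch))))
    # Guard against an all-zero calibration set.
    return amax / fp8_max if amax > 0.0 else 1.0


def fake_quantize(x, scale, fp8_max=E4M3_MAX):
    """Scale into the fp8 range, clip, and scale back (no fp8 rounding)."""
    return np.clip(x / scale, -fp8_max, fp8_max) * scale


# Offline calibration on a small synthetic dataset (an assumption for the demo).
rng = np.random.default_rng(0)
batches = [rng.normal(size=(8, 16)).astype("float32") for _ in range(4)]
scale = calibrate_scale(batches)
```

Once computed, `scale` stays constant at inference time; values larger than anything seen during calibration are simply clipped.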
Thanks for your reply. It seems Keras needs more discussion to decide whether to support fp8 calibration. Maybe you can post an update here if there is any progress in the future.
As for fp8 inference after fp8 training, Keras does not seem to support it well. Can we add an is_training argument to float8_call to decide whether to compute a new scale? A new amax history is also not needed.
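The request above could look roughly like the following NumPy sketch. It is not the Keras implementation: `Float8State`, `quantize`, and the history length are hypothetical, and the delayed-scaling rule (`scale = max(amax_history) / 448`) is an assumption made for illustration.

```python
import numpy as np

E4M3_MAX = 448.0  # largest finite value representable in float8 E4M3


class Float8State:
    """Hypothetical holder for one tensor's scale and amax history."""

    def __init__(self, history_length=4):
        self.scale = 1.0
        self.amax_history = np.zeros(history_length, dtype="float32")

    def quantize(self, x, is_training):
        if is_training:
            # Delayed scaling: derive the scale from past amax values,
            # then record the amax of the current tensor.
            amax = float(self.amax_history.max())
            if amax > 0.0:
                self.scale = amax / E4M3_MAX
            self.amax_history = np.roll(self.amax_history, 1)
            self.amax_history[0] = np.max(np.abs(x))
        # At inference, scale and history are left untouched.
        return np.clip(x / self.scale, -E4M3_MAX, E4M3_MAX) * self.scale
```

With `is_training=False`, the call reuses the scale learned during training and skips both the scale update and the amax bookkeeping.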
Since #19682 has been merged, you can set
Thanks, will test it soon.
fp8 training is supported in Keras. Does Keras have a plan to support fp8 inference? Maybe a naive solution like the one in TransformerEngine is enough.