Speeding up models with mixed precision policy #143
Replies: 5 comments
-
Using
-
I'm not sure. You may try creating a new virtualenv and installing the newest TensorFlow with CUDA dependencies:

```sh
pip3 install virtualenv
virtualenv -p python3 ~/virtualenvs/tf_test
source ~/virtualenvs/tf_test/bin/activate
pip3 install tensorflow[and-cuda]  # TF >= 2.14.0 with CUDA dependencies
pip3 install pillow h5py tqdm
```

You may also try whether XLA compilation helps:

```sh
TF_XLA_FLAGS="--tf_xla_auto_jit=2" CUDA_VISIBLE_DEVICES=0 python3 train_script.py -m EfficientNetV2B0 -p adamw
```

Also note whether any CUDA-related warning or error is printed when importing TF. If none of these work, you may also try updating the nvidia driver version. I once met a similar issue a long time ago, where Conv2D was 20x slower using mixed_float16 during training, and it was solved after updating the driver version.
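For reference, XLA can also be enabled per-model instead of via the environment variable; a minimal sketch, assuming TF >= 2.8 where `Model.compile` accepts `jit_compile` (the tiny model here is just a stand-in):

```python
import tensorflow as tf

# Toy stand-in model; in this thread the real model is EfficientNetV2B0.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# jit_compile=True XLA-compiles this model's training/inference functions,
# a per-model alternative to TF_XLA_FLAGS="--tf_xla_auto_jit=2".
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    jit_compile=True,
)
```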
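Worth checking as well: `mixed_float16` only gives real speedups on GPUs with Tensor Cores (NVIDIA compute capability 7.0 or higher); on older GPUs the policy can actually slow training down. A minimal sketch to verify this, assuming TF 2.x:

```python
import tensorflow as tf

print("TF version:", tf.__version__)
print("Built with CUDA:", tf.test.is_built_with_cuda())

# Tensor Cores require compute capability >= 7.0 (Volta or newer);
# without them, mixed_float16 is unlikely to speed anything up.
for gpu in tf.config.list_physical_devices("GPU"):
    details = tf.config.experimental.get_device_details(gpu)
    cc = details.get("compute_capability")  # e.g. (8, 6) for RTX 30xx
    print(gpu.name, "compute capability:", cc)
    if cc is not None and cc < (7, 0):
        print("  -> no Tensor Cores; mixed_float16 may hurt performance here")
```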
-
Ok, I'll keep trying different things.
-
I tried another version of TensorFlow and it worked. This appears to be a TF 2.10 issue.
-
Interesting, previously I was mostly using TF 2.10. I think it may be something related to the TF and CUDA or nvidia driver versions.
-
Previously I had success speeding up your EfficientNetV2 with the `mixed_precision` policy (https://www.tensorflow.org/guide/mixed_precision?hl=en). Training became about 2.5 times faster, if I remember correctly.

But I updated TensorFlow and all dependencies to try other models, and now `mixed_precision` doesn't work. Instead it slows training down by a factor of two. The sad thing is that EfficientNetV2 is now slower with `mixed_precision` than without it. Before updating the environment, I had about 35-40 mins per epoch with `mixed_precision`. Now with the same data it's 1h per epoch, and 2h per epoch with `mixed_precision`.

I spent the whole day trying to reproduce the old environment, but it looks like I installed the EfficientNetV2 package via `conda`, and now I can't find it. The pip installation is not compatible with the old environment. Anyway, that's just a picture of how I ruined my installation; now my training is 2 times slower, which is disastrous.

I hope you could give some advice on what's wrong with speeding up the model. I've checked: all layers have float32 variable dtype and float16 output dtype (as they should after applying the policy), except the output softmax activation. So everything looks correct.
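For reference, a minimal sketch of the policy setup and the dtype check described above, assuming TF 2.x Keras (the tiny model is just a placeholder for EfficientNetV2):

```python
import tensorflow as tf
from tensorflow.keras import mixed_precision

# Enable the global policy before building the model.
mixed_precision.set_global_policy("mixed_float16")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation="relu"),
    # The mixed precision guide recommends keeping the final softmax
    # in float32 for numerical stability.
    tf.keras.layers.Dense(10, activation="softmax", dtype="float32"),
])

# The check described above: variables stay float32, compute/output is
# float16 everywhere except the final float32 softmax layer.
for layer in model.layers:
    print(layer.name, "vars:", layer.dtype, "compute:", layer.compute_dtype)
```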