Conditional Numerical Reproducibility (CNR) #288
Dear David and Gregory, I have investigated this some more, and I think the bottom line is that the differences you are seeing come from how floating-point calculations are handled, and how floating-point numbers are approximated, on different Intel CPU architectures and on NVIDIA GPUs. There are potentially several factors at play here, and I recommend you read the following articles:
I think the first thing to try is to follow the advice in the Intel CNR document, and force the Intel Math library to follow the same code path irrespective of the Intel CPU type. It looks like you can do this by setting the
I would try the
The second thing to try is to force the NVIDIA compiler (nvcc) not to use the Fused Multiply-Add (FMA) operation, which is available on the GPU but (as far as I know) does not exist on the CPU. This may bring the GPU result closer to the CPU one. You can do this by making a small change to the gprMax code (NB: there is no need to recompile gprMax after this change, just run it). If you go into the module 'model_build_run.py' and to line 498, you should see the code
change it to,
The above assumes you are not using MS Windows. If you are, then change line 496 to add the above argument to the compiler options list.
I am interested to know how this goes, and will give it some more thought in the meantime. I'm also going to add it to our issue tracker on GitHub, so we have a record of it to refer back to in the future.
Kind regards,
Quoted post from our Google Group - https://groups.google.com/g/gprmax/c/KLyUH4pnPxE/m/tEWRC3XpAQAJ
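As a side illustration of the two effects mentioned in the quoted post, here is a minimal Python sketch. It is not part of gprMax, and the function names are hypothetical. It assumes standard IEEE 754 double precision and emulates a fused multiply-add with exact rational arithmetic (one rounding at the end) rather than calling any hardware FMA instruction. It also shows that simply reordering additions, as different vectorised code paths may do, can change the last bits of a result.

```python
from fractions import Fraction


def unfused_multiply_add(a, b, c):
    """Compute a*b + c with two roundings: one after the multiply, one after the add."""
    product = a * b  # rounded to the nearest double here
    return product + c  # rounded again here


def emulated_fma(a, b, c):
    """Emulate a fused multiply-add: exact a*b + c, rounded to a double only once.

    Fraction(float) converts exactly, so the arithmetic below has no rounding
    until the final float() call.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so that the exact product 1 - 2**-54 is not representable
# as a double and rounds to 1.0 when the multiply is rounded on its own.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

unfused = unfused_multiply_add(a, b, c)
fused = emulated_fma(a, b, c)
print("separate multiply and add:", unfused)
print("emulated fused multiply-add:", fused)

# Floating-point addition is not associative, so the order of a sum matters.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
print("left to right:", left_to_right)
print("right to left:", right_to_left)
```

Neither result in each pair is "wrong"; they differ only in where rounding happens. Differences of this size can accumulate over many time steps, which may be one reason results differ between hardware and compiler settings.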