Potku incorrectly assumes that the entire simulation has stopped when just one MCERD process crashes #84
Comments
Are MCERD processes prone to crashing on their own, or does this only occur when done manually? In either case, killing all processes is probably the best option. If a process crashes, there are likely bigger issues than worrying about finishing the rest.
In my experience, MCERD most likely crashes because some Jibal data file is missing, or there is a problem with simulation settings, such as no target layers. These problems would cause all processes to crash immediately. It seems unlikely that a certain random seed could cause a crash while other random seeds work. So if only one process drops, it could be an indication of some outside shenanigans.
@jussiks is correct that crashes are typically due to some outside influence, and killing everything in sight is probably acceptable behaviour. MCERD should not crash on its own. That being said, because of the pseudorandom nature of MC simulations (and the laziness of the programmers), MCERD can crash with some (small) probability that depends on the random seed. Typically an assertion is missing and an out-of-bounds array index leads to a segmentation fault. I have both introduced and fixed bugs of this kind in MCERD. Obviously in this case the only solution is to fix that particular bug and learn some defensive programming.
When running multiple simulation processes, Potku assumes that the entire simulation has stopped with an error when only one of the processes has crashed. Remaining processes are left running in the background, unobserved by Potku. They can no longer be stopped from the GUI, nor is the observed atom count increased until the simulation is continued.
Reproduction
This bug can be reproduced by forcibly killing one of the MCERD processes. In the picture below I have started 4 processes from the GUI, and killed one from the command line.
ps -a | grep mcerd
shows that there are still three processes running despite the GUI showing them all as stopped. The same thing can be done on Windows using the Task Manager.
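The same situation can be approximated without the GUI. The snippet below is only a sketch: it uses short-lived Python child processes as stand-ins for MCERD, and the helper names (`spawn_workers`, `count_alive`, `kill_one_and_count`) are hypothetical and are not part of Potku. It starts four processes, kills one, and counts how many are still alive.

```python
import subprocess
import sys


def spawn_workers(count):
    """Start `count` long-running stand-in processes (not real MCERD)."""
    cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
    return [subprocess.Popen(cmd) for _ in range(count)]


def count_alive(procs):
    """Return how many of the given processes have not exited yet."""
    return sum(1 for p in procs if p.poll() is None)


def kill_one_and_count(count=4):
    """Kill the first process and report how many others keep running."""
    procs = spawn_workers(count)
    try:
        procs[0].kill()   # simulate a crash of a single process
        procs[0].wait()   # make sure it has really exited
        return count_alive(procs)
    finally:
        # Clean up the stand-in processes so nothing is left behind.
        for p in procs:
            if p.poll() is None:
                p.kill()
            p.wait()
```

Calling `kill_one_and_count(4)` mirrors the manual steps above: one process is gone and the rest continue, which is the state that the GUI reports as fully stopped.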
How to fix
Either:
Generally speaking, if one process crashes, others are likely to crash too as the only difference between them is the random seed. In this case, Potku's current behaviour isn't much of a problem since the runaway processes won't stay alive for too long.
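One option raised in the comments is to stop every remaining process as soon as any one of them crashes. A minimal sketch of that idea follows, assuming the processes are started with `subprocess.Popen`. The function name `run_until_first_failure` and the polling interval are hypothetical, and this is not how Potku currently manages MCERD.

```python
import subprocess
import time


def run_until_first_failure(commands, poll_interval=0.1):
    """Run all commands; if any exits with a non-zero code, stop the rest.

    Returns the return codes in the same order as `commands`.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        while True:
            codes = [p.poll() for p in procs]
            if any(c not in (None, 0) for c in codes):
                break  # at least one process crashed or was killed
            if all(c == 0 for c in codes):
                break  # every process finished normally
            time.sleep(poll_interval)
    finally:
        # Terminate whatever is still running so nothing is left unobserved.
        for p in procs:
            if p.poll() is None:
                p.terminate()
        for p in procs:
            p.wait()
    return [p.returncode for p in procs]
```

With this approach no process can keep running in the background after the simulation is reported as stopped. The cost is that results from healthy processes are cut short when only one seed triggers a crash.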