Fix bug in modified locateParticles #274

Draft
wants to merge 2 commits into
base: master
from


srikrrish
Member

The modified version of locateParticles with nearest-neighbor search has a bug that leads to a lack of charge conservation or to segmentation faults in TestScatter and PenningTrap with load balancing, on both CPUs and GPUs.

@srikrrish srikrrish linked an issue Mar 26, 2024 that may be closed by this pull request
@srikrrish srikrrish marked this pull request as draft March 26, 2024 13:22
@srikrrish srikrrish force-pushed the 273-fix-bug-in-modified-locateparticles branch from aa597f9 to 56ed250 Compare May 8, 2024 08:01
@srikrrish
Member Author

The last commit still doesn't fix the issue. I observe two problems when running
srun ./PenningTrap 32 32 32 655360 400 FFT 0.01 LeapFrog -b 1.0 --info 5

  1. On OpenMP builds with 4 nodes, 16 tasks per node (64 MPI ranks), and 2 OMP threads, the code runs and produces output and timing files but doesn't terminate and has to be cancelled manually.
  2. On GPU builds with 16 nodes and 4 GPUs per node (again 64 MPI ranks/GPUs), there is a charge conservation error already on the first step:
Warning: Option '32' is not parsed by Ippl.
Warning: Option '32' is not parsed by Ippl.
Warning: Option '32' is not parsed by Ippl.
Warning: Option '655360' is not parsed by Ippl.
Warning: Option '400' is not parsed by Ippl.
Warning: Option 'FFT' is not parsed by Ippl.
Warning: Option '0.01' is not parsed by Ippl.
Warning: Option 'LeapFrog' is not parsed by Ippl.
Pre Run{0}> Discretization:
Pre Run{0}> nt 400 Np= 655360 grid = ( 32 , 32 , 32 )
Initialize Particles{0}> Starting first repartition
Initialize Particles{0}> particles created and initial conditions assigned
scatter {0}> 0
PenningTrap{0}> Starting iterations ...
Pre-step{0}> Done
scatter {0}> 0.0544169
scatter {0}> Time step: 0
scatter {0}> Total particles in the sim. 655360 after update: 655360
scatter {0}> Rel. error in charge conservation: 0.0544169
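
For reference, a minimal sketch of how a relative charge conservation error like the one in the log could be computed, assuming the reported quantity is |Q_total - Q_grid| / |Q_total|. The helper `relChargeError` and its parameters are hypothetical and are not taken from the IPPL sources.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper, not the IPPL implementation: sums the charge
// deposited on the grid and compares it with the total particle charge.
double relChargeError(const std::vector<double>& gridCharge, double totalCharge) {
    assert(totalCharge != 0.0);  // the relative error is undefined otherwise
    double deposited = 0.0;
    for (double q : gridCharge) {
        deposited += q;
    }
    return std::fabs((totalCharge - deposited) / totalCharge);
}
```

In an MPI run the per-rank sums would have to be reduced across all ranks before taking the ratio. Under this assumption, a value of 0.0544169 would mean that roughly 5% of the charge is missing from (or added to) the grid after the scatter.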

@aaadelmann
Member

aaadelmann commented May 8, 2024

A few thoughts on 1: a) we get stuck in a destructor; can you add prints in Ippl::finalize()?

void finalize() {
    Comm->deleteAllBuffers();
    Kokkos::finalize();
    // we must first delete the communicator and
    // afterwards the MPI environment
    Comm.reset(nullptr);
    Env.reset(nullptr);
}
2: maybe the write timing has a problem.

Development

Successfully merging this pull request may close these issues.

Fix bug in modified locateParticles
3 participants