-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IJ assembly memory #1099
base: master
Are you sure you want to change the base?
IJ assembly memory #1099
Conversation
…memory Conflicts: src/test/ij_assembly.c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. Thank!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Rui Peng!
src/IJ_mv/IJMatrix_parcsr_device.c
Outdated
stack_elmts_max_new = stack_elmts_required; | ||
if (early_assemble == 1) | ||
{ | ||
early_assemble_flag = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite follow everything here. Why do you have both early_assemble
and early_assemble_flag
? Also, why do you check early_assemble
in the outer if
statement and then check again if early_assemble == 1
inside? Does early_assemble
take other non-zero values besides 1? I had thought it was just a flag to turn early assemble on/off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @waynemitchell Thanks for catching this. In the old code before we fixed the code hang issue, we had a situation that early_assemble = 2
. This was used when we still did a communication step in early assemble that calls Addvalues
which possible triggers another early assemble, and another communication, and even possibly another early assembly and so on ... recursively. So we hacked it by setting it to 2
by temporally disable early assemble when communicating for other processes' entries. Now that the whole communication step in early assemble is gone, the =2
thing is no longer there which leaves the silly code in place
if (early_assemble) // 0 or 1
{
...
if (early_assemble == 1)
I have updated the code slightly, removing the inner if
. The logic should be simple and clear now: early_assemble
is a global flag that turns on/off this feature and early_assemble_flag
is locally per each call of Addvalues
. When out of memory it triggers the assemble action at the end of hypre_IJMatrixSetAddValuesParCSRDevice
.
Hope this explains.
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had one question above, but this all looks good. Also, I'll give a +1 to Victor's suggestion to document the new user-level functions. Thanks, Rui Peng!
Co-authored-by: Victor A. P. Magri <[email protected]>
Co-authored-by: Victor A. P. Magri <[email protected]>
Co-authored-by: Victor A. P. Magri <[email protected]>
This PR adds early assembly option to optimize the memory usage in the IJ interface to assemble matrices on GPUs.
User level APIs are