Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IJ assembly memory #1099

Open
wants to merge 33 commits into
base: master
Choose a base branch
from
Open

IJ assembly memory #1099

wants to merge 33 commits into from

Conversation

liruipeng
Copy link
Contributor

@liruipeng liruipeng commented Jun 17, 2024

This PR adds early assembly option to optimize the memory usage in the IJ interface to assemble matrices on GPUs.

User level APIs are

HYPRE_IJMatrixSetEarlyAssemble
HYPRE_SStructMatrixSetEarlyAssembly

Copy link
Contributor

@rfalgout rfalgout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Thank!

Copy link
Contributor

@victorapm victorapm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Rui Peng!

stack_elmts_max_new = stack_elmts_required;
if (early_assemble == 1)
{
early_assemble_flag = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite follow everything here. Why do you have both early_assemble and early_assemble_flag? Also, why do you check early_assemble in the outer if statement and then check again if early_assemble == 1 inside? Does early_assemble take other non-zero values besides 1? I had thought it was just a flag to turn early assemble on/off.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @waynemitchell Thanks for catching this. In the old code before we fixed the code hang issue, we had a situation that early_assemble = 2. This was used when we still did a communication step in early assemble that calls Addvalues which possible triggers another early assemble, and another communication, and even possibly another early assembly and so on ... recursively. So we hacked it by setting it to 2 by temporally disable early assemble when communicating for other processes' entries. Now that the whole communication step in early assemble is gone, the =2 thing is no longer there which leaves the silly code in place

          if (early_assemble) // 0 or 1
          {
             ...
             if (early_assemble == 1)

I have updated the code slightly, removing the inner if. The logic should be simple and clear now: early_assemble is a global flag that turns on/off this feature and early_assemble_flag is locally per each call of Addvalues. When out of memory it triggers the assemble action at the end of hypre_IJMatrixSetAddValuesParCSRDevice.

Hope this explains.

Thanks

Copy link
Contributor

@waynemitchell waynemitchell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had one question above, but this all looks good. Also, I'll give a +1 to Victor's suggestion to document the new user-level functions. Thanks, Rui Peng!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants