Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nontensor gen operators #1735

Merged
merged 4 commits into from
Feb 3, 2025
Merged

Nontensor gen operators #1735

merged 4 commits into from
Feb 3, 2025

Conversation

jeremylt
Copy link
Member

@jeremylt jeremylt commented Jan 27, 2025

Non-tensor gen operators 🎉

Closes #839

Note: IBM Power test failure is unrelated to this PR

@jeremylt jeremylt self-assigned this Jan 27, 2025
@jeremylt jeremylt force-pushed the jeremy/nontensor-gen branch 3 times, most recently from 3d889fe to 1af2191 Compare January 27, 2025 19:29
@jedbrown
Copy link
Member

Cool! Any preliminary performance comparison?

@jeremylt
Copy link
Member Author

Not yet, still chasing some odd bugs here and there

@jeremylt jeremylt force-pushed the jeremy/nontensor-gen branch 2 times, most recently from 953140c to c869107 Compare January 28, 2025 22:40
@jeremylt
Copy link
Member Author

Ok, pure non-tensor works now in 1, 2, and 3D. I need to get operators with a mix of tensor and non-tensor elements working, or put in bypass for now

@jrwrigh
Copy link
Collaborator

jrwrigh commented Jan 28, 2025

I need to get operators with a mix of tensor and non-tensor elements working, or put in bypass for now

You mean as a single operator or as suboperators? I had imagined that each operators would have to work with a single element type. So a composite operator would have 1 suboperator per element type.

Edit: Continuing that thought, a CeedBasis would only be valid for a single element type anyways, so I think the single-element-type-per-operator approach makes sense from that limitation as well.

@jeremylt
Copy link
Member Author

I'm talking about within a single operator. Operators must share a quadrature space but can have different element types feeding into the same quadrature space. Some of our operators on faces are like that in Ratel.

@jrwrigh
Copy link
Collaborator

jrwrigh commented Jan 28, 2025

As in surface-based CeedBasis being used for a volume operator?

@jeremylt
Copy link
Member Author

One instance off the top of my head is computing geometric factors on the face with some full cell data

@jeremylt
Copy link
Member Author

ToDo: Write up issues on follow-up tasks, run test suite on my ROCm machine

@jeremylt
Copy link
Member Author

jeremylt commented Jan 30, 2025

on my local machine, I'm seeing for shared

    DoFs/Sec in CG                          : 29.1799 (29.1799) million

but for gen

    DoFs/Sec in CG                          : 9.70631 (9.70631) million

when testing BP3. Digging around currently

With the same mesh though, gen can run BP4 and shared cannot, so we are getting a big memory savings here.

@jeremylt
Copy link
Member Author

jeremylt commented Jan 30, 2025

Yup, gen goes up to

    DoFs/Sec in CG                          : 33.0333 (33.0333) million

if I used the thread block strategy from shared bases. So a slight improvement in perf but the real win is the big decrease in memory used.

@jeremylt jeremylt force-pushed the jeremy/nontensor-gen branch from 6f28039 to f82027a Compare January 30, 2025 22:01
@jeremylt
Copy link
Member Author

I'm investigating why the ARM/IBM Power jobs are misbehaving but its unrelated to this PR so we can merge it without those passing

@jeremylt jeremylt merged commit 6b50d2b into main Feb 3, 2025
28 of 29 checks passed
@jeremylt jeremylt deleted the jeremy/nontensor-gen branch February 3, 2025 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CUDA/HIP Backend Refactor
3 participants