Nontensor gen operators #1735
Conversation
Force-pushed from 3d889fe to 1af2191
Cool! Any preliminary performance comparison?
Not yet, still chasing some odd bugs here and there.
Force-pushed from 953140c to c869107
Ok, pure non-tensor works now in 1, 2, and 3D. I need to get operators with a mix of tensor and non-tensor elements working, or put in a bypass for now.
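For context on what "non-tensor" means at the API level, here is a minimal sketch of a non-tensor CeedBasis (linear triangle, one-point quadrature) of the kind the gen backend now needs to handle. The P1 reference-element data and the `/gpu/cuda/gen` resource string are illustrative assumptions, not taken from this PR.

```c
#include <ceed.h>

int main(void) {
  Ceed      ceed;
  CeedBasis basis;

  // Non-tensor bases are built from explicit interp/grad matrices rather than
  // 1D tensor factors. Data below: linear (P1) triangle, 3 nodes, one
  // quadrature point at the barycenter of the reference triangle.
  const CeedScalar interp[1 * 3]   = {1. / 3., 1. / 3., 1. / 3.};
  const CeedScalar grad[2 * 1 * 3] = {-1., 1., 0.,   // d/dx of N1, N2, N3
                                      -1., 0., 1.};  // d/dy of N1, N2, N3
  const CeedScalar q_ref[2 * 1]    = {1. / 3., 1. / 3.};
  const CeedScalar q_weight[1]     = {0.5};

  // The gen backend is selected via the resource string (assumes a CUDA build;
  // "/cpu/self" works for a quick functional check)
  CeedInit("/gpu/cuda/gen", &ceed);
  CeedBasisCreateH1(ceed, CEED_TOPOLOGY_TRIANGLE, 1, 3, 1,
                    interp, grad, q_ref, q_weight, &basis);

  CeedBasisDestroy(&basis);
  CeedDestroy(&ceed);
  return 0;
}
```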
You mean as a single operator or as suboperators? I had imagined that each operator would have to work with a single element type, so a composite operator would have one suboperator per element type. Edit: Continuing that thought, a CeedBasis is only valid for a single element type anyway, so I think the single-element-type-per-operator approach makes sense from that limitation as well.
I'm talking about within a single operator. Operators must share a quadrature space, but different element types can feed into that same quadrature space. Some of our operators on faces are like that in Ratel.
As in a surface-based CeedBasis being used for a volume operator?
One instance off the top of my head is computing geometric factors on the face using some full-cell data.
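To make the composite-operator point above concrete, here is a rough sketch of the one-suboperator-per-element-type pattern using the libCEED composite-operator API. The QFunction, bases, restrictions, and field names ("u"/"v") are hypothetical and assumed to be built elsewhere; only the wiring is shown.

```c
#include <ceed.h>

// Hypothetical helper: build a composite operator over a mixed hex/tet mesh,
// with one suboperator per element type, all applying the same QFunction.
static int BuildMixedTopologyOperator(Ceed ceed, CeedQFunction qf,
                                      CeedElemRestriction restr_hex, CeedBasis basis_hex,
                                      CeedElemRestriction restr_tet, CeedBasis basis_tet,
                                      CeedOperator *op_composite) {
  CeedOperator op_hex, op_tet;

  // Suboperator for the hexahedral (tensor) elements
  CeedOperatorCreate(ceed, qf, CEED_QFUNCTION_NONE, CEED_QFUNCTION_NONE, &op_hex);
  CeedOperatorSetField(op_hex, "u", restr_hex, basis_hex, CEED_VECTOR_ACTIVE);
  CeedOperatorSetField(op_hex, "v", restr_hex, basis_hex, CEED_VECTOR_ACTIVE);

  // Suboperator for the tetrahedral (non-tensor) elements
  CeedOperatorCreate(ceed, qf, CEED_QFUNCTION_NONE, CEED_QFUNCTION_NONE, &op_tet);
  CeedOperatorSetField(op_tet, "u", restr_tet, basis_tet, CEED_VECTOR_ACTIVE);
  CeedOperatorSetField(op_tet, "v", restr_tet, basis_tet, CEED_VECTOR_ACTIVE);

  // The composite operator sums the action of its suboperators
  CeedCompositeOperatorCreate(ceed, op_composite);
  CeedCompositeOperatorAddSub(*op_composite, op_hex);
  CeedCompositeOperatorAddSub(*op_composite, op_tet);

  // Release the local handles; the composite keeps its own references
  CeedOperatorDestroy(&op_hex);
  CeedOperatorDestroy(&op_tet);
  return CEED_ERROR_SUCCESS;
}
```

Whether a single (non-composite) operator should also accept multiple element types feeding one quadrature space, as described for the Ratel face operators above, is the open question in this thread.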
Force-pushed from c869107 to dc007f0
To do: write up issues for the follow-up tasks and run the test suite on my ROCm machine.
Force-pushed from d8c2853 to 9123fb0
On my local machine, when testing BP3, I'm seeing [performance figure not captured] for shared but [performance figure not captured] for gen. Digging around currently. With the same mesh, though, gen can run BP4 and shared cannot, so we are getting big memory savings here.
Yup, gen goes up to [performance figure not captured] if I use the thread block strategy from the shared bases. So a slight improvement in perf, but the real win is the big decrease in memory used.
Force-pushed from 6f28039 to f82027a
I'm investigating why the ARM/IBM Power jobs are misbehaving, but it's unrelated to this PR, so we can merge without those passing.
Non-tensor gen operators 🎉
Closes #839
- [ ] Support tensor + non-tensor mix (follow-up): Gen Tensor/NonTensor Mixed #1736
- [ ] Duplicate mat loads (follow-up): De-duplicate Gen Basis Matrices #1737

Note: IBM Power test failure is unrelated to this PR