4RDM implementation #105

Merged: 32 commits merged into master on Apr 10, 2025
Conversation

DanielCalero1 (Collaborator) commented Mar 25, 2025

This PR addresses issue #85. It includes the following implementations:

  1. Add to compute_rdms_1234() in rdm.cpp the code to compute the elements of the 4RDM.
  2. Update the Python bindings in rdm.cpp, as well as the bindings in binding.cpp and pyci.h.
  3. Update the function spinize_rdms_1234 in utility.py to receive the vector elements of the 4RDM, build the full 4RDM tensor, and symmetrize it.
  4. Update the function spin_free_rdms() in utility.py to compute the spin-free 4RDM.
  5. Add new tests in test_routines.py to check the correctness of the 4RDM.

Besides that, I think I should add an option to spinize_rdms_1234 that lets the user select whether to compute only the 3RDM or also the 4RDM.
What do you think would be the best way to do that? I thought of adding a flag to the function with three possible values: "3rdm", "4rdm", or "34rdm". Depending on the choice, the 3RDM and/or the 4RDM would be computed, as in the sketch after this comment.

Do you have another suggestion?
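
For concreteness, a minimal sketch of the proposed flag; the signature and the build_rdm3/build_rdm4 helpers are hypothetical stand-ins, not the actual utility.py code:

    # Hypothetical sketch of the proposed flag; build_rdm3/build_rdm4
    # stand in for the actual construction code in utility.py.
    def spinize_rdms_1234(d1, d2, d3, d4, mode="34rdm"):
        if mode not in ("3rdm", "4rdm", "34rdm"):
            raise ValueError(f"unknown mode: {mode!r}")
        rdm3 = build_rdm3(d1, d2, d3) if mode in ("3rdm", "34rdm") else None
        rdm4 = build_rdm4(d1, d2, d3, d4) if mode in ("4rdm", "34rdm") else None
        return rdm3, rdm4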

DanielCalero1 (Collaborator, Author) commented Mar 25, 2025

I found a problem; I will reopen the pull request once it is fixed.

msricher (Collaborator) commented Mar 25, 2025

Hi, thanks for the hard work! I think having a flag determine which RDMs to compute is a good way to do it.

There's some dynamic programming you can do too, if you want: basically, you'd have an array of numbers/functions [X...] corresponding to which RDMs you want, and at each level of the loop (for p..., for q..., etc.) you'd iterate through the list and call a corresponding update_rdm_X() to update the state of the RDMs you're currently computing. Essentially, the loop you have now would be a static loop, but you'd be doing dynamically chosen computations at each level of it. You'd likely need some kind of class and method to store the intermediates, which would then be gathered into a tuple and returned from the top-level function.

    # Determine which RDMs to compute (e.g. the 1- and 4-RDM) and make a
    # list of callable objects containing the corresponding intermediate
    # RDM state, e.g. comps = [RDMState1(), RDMState4()].
    def compute_rdms(norb):
        comps = [RDMState1(norb), RDMState4(norb)]
        for p in range(norb):
            for comp in comps:
                comp.update_p(p)
            for q in range(norb):
                for comp in comps:
                    comp.update_pq(p, q)
                # ... deeper index levels (r, s, ...) follow the same pattern
        return tuple(comp.rdm for comp in comps)
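
One of those intermediate-state objects might look like this (an illustrative sketch only, not actual PyCI code):

    import numpy as np

    class RDMState1:
        """Illustrative container for 1-RDM intermediates."""
        def __init__(self, norb):
            self.rdm = np.zeros((norb, norb))

        def update_p(self, p):
            pass  # accumulate contributions that depend only on p

        def update_pq(self, p, q):
            pass  # accumulate contributions that depend on (p, q)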

Doing it that way probably doesn't give us much, and the operation is slow anyway, so I'm happy to just use a flag and if/else.

On second thought, a flag is way simpler and probably faster.

DanielCalero1 (Collaborator, Author)

I fixed some problems I had with the mixed-spin blocks. Now it gives the correct result. I ran tests with BH3, H8, and H10 on Compute Canada, and all of them work correctly.
I also added the option to compute only the 3RDM, or both the 3- and 4RDM. I did not add an option to compute only the 4RDM because it would slightly complicate the assignment of elements (I would have to add some extra conditionals or make the function longer); since the 3RDM does not occupy much memory compared to the 4RDM, I just added the option to compute only the 3RDM or both of them.

Evaluating the 4RDM increases the computation time significantly: while evaluating the 1-, 2-, and 3RDM for H8 takes about 5 seconds, the computation including the 4RDM takes about 8 minutes.
Additionally, as expected, since the 4RDM is an 8-index tensor, it requires a lot of memory. For example, for H8 and BH3 in STO-6G it requires ~70 GB; for H10 in the same basis it requires ~250 GB.
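
As a rough editorial check of these figures (not part of the PR), the size of the spin-resolved 4RDM follows directly from its shape:

    # Back-of-the-envelope size of the spin-resolved 4RDM in float64:
    # an 8-index tensor, each index running over 2*n spin-orbitals.
    n = 10                                 # spatial orbitals, e.g. H10 in STO-6G
    n_elems = (2 * n) ** 8                 # 20**8 = 2.56e10 elements
    print(f"{n_elems * 8 / 1e9:.0f} GB")   # ~205 GB, same order as the ~250 GB above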

The most expensive part of the code, in both runtime and memory usage, is the symmetrization of the 4RDM. This object has 12 symmetries among the upper indices and 12 among the lower indices (plus the antisymmetries), so there are 24 symmetrization operations in total, each done with numpy.einsum. These einsum symmetrizations seem to be quite slow, but I think einsum is the fastest and simplest way to do it. I am currently testing whether passing an 'optimize' argument to einsum helps with performance.
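
To illustrate the kind of einsum-based symmetrization meant here (the index permutation below is a made-up example, not one of the PR's actual 24 operations):

    import numpy as np

    def add_permutation(rdm4, spec):
        # Accumulate one permutational symmetry of the 8-index tensor,
        # e.g. spec="pqrstuvw->qprsutvw" swaps the first two upper and
        # the first two lower indices. Each call reads and writes every
        # element and allocates a full-size temporary, which is why this
        # step dominates both runtime and memory.
        return rdm4 + np.einsum(spec, rdm4)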

Aside from that issue, everything seems to be working correctly.

DanielCalero1 reopened this Apr 2, 2025
DanielCalero1 (Collaborator, Author)

I tested the numpy.einsum operations with different 'optimize' arguments, and it does not help. I've been reading, and it seems that for an array as big as the 4RDM it might be better to use numpy.transpose instead of numpy.einsum, since transpose does not create a copy of the array but just a view of it. I don't know how that will affect overall performance; I am going to check and then post here what I find to be the best approach.

DanielCalero1 (Collaborator, Author)

I did some tests using numpy.transpose instead of numpy.einsum. It is significantly slower, and it also has memory problems. Normally numpy.transpose returns only a view of the transposed array; however, when the array is as big as the 4RDM, NumPy sometimes needs to materialize a copy anyway, and when a copy is created the operation incurs a lot of cache misses, strongly decreasing performance.
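
A small sketch of the view-versus-copy behaviour described here (array sizes shrunk for illustration):

    import numpy as np

    a = np.zeros((4,) * 8)                         # small stand-in for the 4RDM
    b = np.transpose(a, (1, 0, 2, 3, 5, 4, 6, 7))  # a view: no data moved yet
    assert b.base is a                             # still shares a's memory
    c = a + b  # the arithmetic walks b in non-contiguous order (or NumPy
               # materializes a contiguous copy), which is cache-unfriendly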

For that reason, I think that numpy.einsum gives the best performance we can get from the Python side, at least without adding any external package.

So I'll leave the code as it is right now.

PaulWAyers (Member) commented Apr 3, 2025

@DanielCalero1, when you have a seniority-zero RDM, do you have the option of storing only the unique and/or non-zero elements in some sensible way? It obviously reduces the memory (by a square root) and the number of operations (also by a square root, I think).

DanielCalero1 (Collaborator, Author)

Well, on the C++ side I directly compute the seniority-zero elements of the 4RDM, which I store in different vectors (e.g. d1 = $\langle pp|qq \rangle$, d2 = $\langle pq|pq \rangle$, d3 = $\langle pqr|pqr \rangle$, etc.). Those vectors are fast to compute and do not occupy much memory. I then use those vectors to create the big 4RDM tensor (on the Python side).

The problem is that d1, d2, ... have different structures: some of them are matrices, others 3-index tensors, others 4-index tensors.

So yes, I can store only the non-zero elements, but they are in principle in different objects. Additionally, even if I take a tensor product with the identity to give all the objects the same number of indices and add them up, I don't know whether I would end up with an object of the appropriate structure for other computations. (At least on my side, the CT commutators force the 4RDM to have eight indices, so I wouldn't know how to use an object containing just the non-zero elements of the 4RDM.)

I would need to think more about it.

The other idea, as I mentioned to you the other day, would be to compute the spin-traced RDM directly from the d1, d2, d3, ... vectors. Since the 4RDM is an 8-index tensor where each index runs over 2*n_orbital values, and the spin-traced 4RDM is an 8-index tensor where each index runs over n_orbital values, the spin-traced 4RDM is $2^8$ times smaller. However, I am not really sure how to do this.

PaulWAyers (Member)

Perfect. If the DM is seniority-zero I guess we need to either return only the special blocks (but that would be a special function with its own API) or return the 4DM as a sparse matrix/object. It's worth thinking about but not, for now, getting bogged down in.

msricher (Collaborator) commented Apr 7, 2025

Yep, I think this is fine for now. I purposely took the approach of only storing the unique spin-blocks of the RDMs, so that other users can expand them out into a (giant) spin-resolved N-RDM whenever they like.

PaulWAyers (Member)

Sounds good. I will let @msricher or @DanielCalero1 do the final work of merging.

msricher (Collaborator) commented Apr 7, 2025

It looks fine to merge once the conflicts are fixed.

DanielCalero1 (Collaborator, Author)

I can fix the conflicts.

DanielCalero1 (Collaborator, Author)

I fixed the conflicts. The test is failing because it is not able to allocate enough memory. What are your suggestions in this case, @msricher? I tested on Compute Canada several times and it worked, but I don't know how to handle it here if there is not enough memory.

msricher (Collaborator) commented Apr 8, 2025

Okay. Can you add a Pytest mark to these tests (https://gist.github.com/devops-school/c0b260e7b845dff98556511071d0bf7c#8-marking-tests), maybe a "bigmem" mark, which is not enabled by default?

msricher (Collaborator) commented Apr 8, 2025

https://docs.pytest.org/en/7.1.x/example/simple.html#control-skipping-of-tests-according-to-command-line-option

DanielCalero1 (Collaborator, Author)

Great! I added a new file to configure the "bigmem" mark so that the test for the function is skipped by default unless specified ("pytest --bigmem").
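
For reference, a minimal sketch of such a file, following the pytest recipe linked above (the actual conftest.py in the PR may differ):

    # conftest.py -- sketch of the "bigmem" mark, per the linked pytest recipe
    import pytest

    def pytest_addoption(parser):
        parser.addoption("--bigmem", action="store_true", default=False,
                         help="run tests that need large amounts of memory")

    def pytest_configure(config):
        config.addinivalue_line("markers", "bigmem: mark test as memory-hungry")

    def pytest_collection_modifyitems(config, items):
        if config.getoption("--bigmem"):
            return  # --bigmem given: run everything
        skip_bigmem = pytest.mark.skip(reason="need --bigmem option to run")
        for item in items:
            if "bigmem" in item.keywords:
                item.add_marker(skip_bigmem)
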
I am also thinking of modifying the Makefile so that the test with the "bigmem" flag can be called from it. I have two options. One is adding an extra argument to the test rule in the Makefile:

    .PHONY: test
    test:
    	@set -e; $(PYTHON) -m pytest -sv ./pyci $(PYTEST_ARGS)

Then executing the test would be something like "make test PYTEST_ARGS=--bigmem".

The other option is to create a new rule:

    .PHONY: test-bigmem
    test-bigmem:
    	@set -e; $(PYTHON) -m pytest -sv ./pyci --bigmem

Then executing the test would be just "make test-bigmem".

Which one would you prefer, @msricher?

msricher (Collaborator) commented Apr 9, 2025

Thanks! I really don't have a preference so long as it works with GitHub Actions. The Makefile dispatchers are just for really quick convenience.

DanielCalero1 (Collaborator, Author) commented Apr 10, 2025

Great! I updated the test so that it checks compute_rdm_1234 only when specified (make test-bigmem). I just tested again on Compute Canada and it is working fine. Additionally, the tests here seem to be working. I think the PR is ready to merge.

PaulWAyers (Member) left a comment

@msricher if you think it is ready to merge then either you or @DanielCalero1 can do it.

msricher merged commit b85fc93 into master on Apr 10, 2025 (1 check passed).
DanielCalero1 mentioned this pull request on Apr 10, 2025.