Tweak CArena Defragmentation Strategy #4531
The previous strategy, added in #4451, has a flaw. Suppose a CArena's initial size is small and we have n vectors, each of size x. We then resize these vectors one by one to size x+y, where y << x. We end up with n new allocations, each of size 2*x+y. Memory usage has roughly doubled in the end, because the unused spaces cannot be combined.
In the new strategy, we only attempt to combine uncombined allocations when the combined amount is not less than the requested allocation size.
We now also check the malloc error code. If malloc fails, we try to free more memory and call malloc again.
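To make the two changes concrete, here is a minimal sketch and not the actual CArena code. The names `FreeBlocks`, `should_combine`, `alloc_with_retry`, `raw_alloc`, and `release_more` are hypothetical, and modelling the "combined amount" as the sum of the unused block sizes is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical model: sizes (in bytes) of unused allocations that could be
// released and requested again as one larger block.
using FreeBlocks = std::vector<std::size_t>;

// Combine only when the combined amount is not less than the request.
// Assumption: "combined amount" is the plain sum of the unused block sizes.
inline bool should_combine(const FreeBlocks& free_blocks, std::size_t nbytes) {
    const std::size_t combined =
        std::accumulate(free_blocks.begin(), free_blocks.end(), std::size_t{0});
    return combined >= nbytes;
}

// Call the raw allocator and check its result. On failure, ask the caller to
// free more memory and call the allocator one more time.
inline void* alloc_with_retry(std::size_t nbytes,
                              const std::function<void*(std::size_t)>& raw_alloc,
                              const std::function<void()>& release_more) {
    void* p = raw_alloc(nbytes);
    if (p == nullptr) {
        release_more();       // hypothetical hook that frees unused memory
        p = raw_alloc(nbytes);
    }
    return p;                 // may still be nullptr if the retry also fails
}
```

With `std::malloc` as the allocator, `raw_alloc` could be `[](std::size_t n) { return std::malloc(n); }`. Injecting the allocator keeps the failure path testable without exhausting real memory.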
Pseudocode of the operation:
    x = // big
    y = // small
    for (n) {
        alloc(x);
    }
    for (n) {
        // like vector resize
        alloc(x+y);
        free(x);
    }
    // test memory here
    for (n) {
        free(x+y);
    }
What I think the various options should end up with:
Current defrag:
This PR:
Current defrag, but with
So I think the last option should be best.
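For reference, a rough sketch of the arithmetic at the "test memory here" point. It only uses the figures from the PR description (n allocations of 2*x+y with the previous strategy) and the bytes the caller actually holds (n allocations of x+y). It does not reproduce the per-option numbers above. The names `Footprint`, `footprint_after_resize`, and `overhead_ratio` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Sizes are in bytes.
struct Footprint {
    std::size_t live;      // bytes the caller holds after the resize loop: n * (x + y)
    std::size_t previous;  // bytes held per the PR description: n * (2 * x + y)
};

// Footprint at the measurement point for n vectors resized from x to x + y.
inline Footprint footprint_after_resize(std::size_t n, std::size_t x, std::size_t y) {
    return Footprint{n * (x + y), n * (2 * x + y)};
}

// How much more memory is held than is actually in use.
inline double overhead_ratio(const Footprint& f) {
    return static_cast<double>(f.previous) / static_cast<double>(f.live);
}
```

With illustrative values n = 8, x = 1000000, y = 1000, `overhead_ratio` comes out just under 2, which matches the "doubled" memory usage in the description when y << x.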
WarpX AMD MI300A (128GB) benchmarks with
@AlexanderSinn But
@ax3l I thought with development your test ran out of memory.
Yes, on 512 nodes. I am good on 1 node. This test is on one node.
I added the number of calls to all but the first entry use init size of 1