Memory issue when root and sub-children are selected at the same time #369
Open
Yogu
wants to merge
3
commits into
refactor-traversal
from
memory-issue-root-and-sub-children
c3e799f to
1d8ebc9
Compare
ec371a7 to
5502368
Compare
1d8ebc9 to
6098d71
Compare
5502368 to
265b74c
Compare
6098d71 to
bc56d66
Compare
433ce56 to
b60bc9f
Compare
bc56d66 to
e2af2aa
Compare
3a0d319 to
b338b0c
Compare
Member
Author
This PR is now finalized, but it depends on #365, so we should probably wait until that one is merged before reviewing this one. (Also, the target branch of this PR is currently the branch of #365.)
6910ac8 to
c023b92
Compare
b338b0c to
bac48ec
Compare
c023b92 to
7f7ce45
Compare
If a loop accesses an outer variable (such as a root variable) and also contains another subquery, ArangoDB copies the outer variable in memory for each subquery instance, that is, once per iteration of the outer loop. This is a known limitation of ArangoDB. These tests cover some of the affected cases. Future commits will improve some of them.
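To make the affected query shape concrete, here is a minimal sketch that builds an AQL string of that form: a loop that reads an outer variable and also contains a further subquery. The function name `buildNestedQuery` and the names `children`, `grandchildren`, and `name` are hypothetical and chosen for illustration only; they are not taken from the test suite of this PR.

```typescript
// Sketch only: builds an AQL query in which the inner loop reads the outer
// variable `root` and also contains another subquery. All collection and
// field names here are illustrative assumptions.
function buildNestedQuery(rootCollection: string): string {
    return [
        `FOR root IN ${rootCollection}`,
        `  RETURN {`,
        `    children: (`,
        `      FOR child IN root.children`,
        `        RETURN {`,
        // access to the outer variable inside the loop
        `          rootName: root.name,`,
        // a sibling subquery inside the same loop
        `          grandchildren: (`,
        `            FOR grandchild IN child.grandchildren`,
        `              RETURN grandchild.name`,
        `          )`,
        `        }`,
        `    )`,
        `  }`,
    ].join('\n');
}
```

In a query of this form, `root` is the outer variable that, according to the commit message, gets copied once per `child` iteration.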
We optimized the AQL generation of TraversalQueryNode so that it uses, for example, array expansions instead of subqueries. These optimizations are also relevant for regular list fields (child entities, value objects, scalars, enums). This is especially important when list fields are queried alongside a root field, because subquery nodes sometimes hold a copy of all variables used in sibling nodes.
References have a memory usage regression because their variable is no longer pulled up. This will be fixed in the next commit.
Some tests where IS_LIST(...) ? ... : [] was replaced by ...[*] twice for the same field show increased memory usage. This is because ArangoDB deduplicated the CalculationNode with the IS_LIST approach but no longer does so with the [*] approach. In other cases (when there is a CalculationNode anyway), the [*] approach reduces memory usage. Because the [*] approach also makes for clearer queries, and there is no clear winner (short of making very strong assumptions about ArangoDB's optimizer and doing a lot of analysis), we don't introduce a special case that uses IS_LIST in getFieldTraversalFragment() for now.
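The two fragment styles compared above can be sketched as follows. The helper names `isListGuardFragment` and `arrayExpansionFragment` are hypothetical and only illustrate the shape of the generated AQL; they are not the actual implementation of getFieldTraversalFragment().

```typescript
// Sketch only: both helpers are hypothetical and return AQL text.

// Older style: guard the value with IS_LIST and fall back to an empty array.
function isListGuardFragment(valueExpr: string): string {
    return `(IS_LIST(${valueExpr}) ? ${valueExpr} : [])`;
}

// Newer style: use the AQL array expansion operator.
function arrayExpansionFragment(valueExpr: string): string {
    return `${valueExpr}[*]`;
}
```

For the expression `root.children`, the first helper yields `(IS_LIST(root.children) ? root.children : [])` and the second yields `root.children[*]`, which is the shorter form the commit settles on.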
…ry usage
ArangoDB also has logic to hoist variable assignments where it can, but it sometimes (or rather often, in our cases) pushes them down again. This is problematic if the result of the assignment is much smaller than its dependencies. For root entities that are accessed via @root, this is often the case, because usually only a subset of the root's fields is accessed within the loop, while many more fields are accessed on root level. We can prevent ArangoDB from pushing the variables down again by wrapping their assignment in NOEVAL(...).
If a query requests many fields from the root object in general (by default, the threshold is 5), the reduce-extraction-to-projection optimization is no longer applied, so the full root object is held in memory. Without variable hoisting, it can then be duplicated for each result item of a traversal, leading to very high memory usage (root entity size * number of collect items).
In the regression test root-fields/root-with-collect, the memory usage of query q increased slightly. This is likely because the added variable assignments have an overhead that is not offset by the hoisting, given the small number of items and the small size of objects in the regression tests.
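As a rough illustration of the two points above, the sketch below wraps an assignment in NOEVAL(...) and computes the worst-case estimate from the formula in the commit message. Both function names, the variable name `v_rootName`, and the sample numbers are assumptions made for this example; the estimate is a simplification, not a measurement.

```typescript
// Sketch only: hypothetical helpers, not part of the PR.

// Emit a LET statement whose value is wrapped in NOEVAL(...), which the
// commit message uses to keep ArangoDB from pushing the assignment down.
function hoistedAssignment(varName: string, valueExpr: string): string {
    return `LET ${varName} = NOEVAL(${valueExpr})`;
}

// Worst-case duplicated memory when the full root object is copied for
// every collect item: root entity size * number of collect items.
function estimateDuplicatedBytes(rootEntityBytes: number, collectItemCount: number): number {
    return rootEntityBytes * collectItemCount;
}
```

For example, `hoistedAssignment('v_rootName', 'root.name')` produces `LET v_rootName = NOEVAL(root.name)`, and a 50,000-byte root entity duplicated across 1,000 collect items gives an estimate of 50,000,000 bytes.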
bac48ec to
1e15e56
Compare
No description provided.