-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize parsing big xml files #5924
base: master
Are you sure you want to change the base?
Optimize parsing big xml files #5924
Conversation
memoize generated is/set/get method names. These methods are called many-many times, but there is very few unique property names among them.
... to a separate reusable method.
instead of ineffectively iterating existing methods one-by-one and generating is/get/set method names in the loop multiple times, hold methods in a Map which allow fast lookup by property name.
Hi @asolntsev, Thank you for creating this improvement PR, it's very appreciated it by the team. I left you a small question/comment I'll be also checking with the team and I'll let you know. In general, this PR looks good to me. I think I would only ask you if you can add some unit tests to Last thing, do you have a comparison from your use case that shows us how this improvement is behaving in numbers (how it's now vs how it was previously)? Thanks, |
…into optimize-parsing-big-xml-files
Thank you Daniel!
On my project and my laptop (MacBook pro M1), I run in parallel threads a suite of integration tests that run LiquiBase multiple times to initialize DB (DB is local in docker container). Execution time of integration test decreased by ~1 minute:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved.
Looks good to me. Most of the build and test checks are looking good (except for functional tests that are failing because of a non-related reason, same for test-harness but they should be ok in the next run).
Impact
Description
When running a big integration test suite that runs LiquiBase multiple times, I realized that LB spends a lot of time on generating the same method names again and again (e.g. given property name
name
, LB generates StringsisName
,getName
andsetName
many-many times).I suggest an optimization that computes them only once and holds in a Map.
Should make parsing big XML files faster and consume less memory.
I believe this change doesn't affect any functionality (besides the above-mentioned performance improvement).