Benchmarks for Parsing and caching auto generated XForms #454
base: master
Conversation
Wow, there's a lot here, @JohnTheBeloved! Can you please describe your high-level approach and what decisions you made along the way? For example, why did you choose to generate main instances with the full range of possible controls and types? Do you expect there to be a performance difference between them? If so, then how can results between runs be compared?

Broadly, it seems to me that in this context, the forms that benchmarks are run over need to be known and stable for results to be comparable. That is, dynamically generating a certain number of elements of a predictable structure seems very valuable. I'm less sure about what the randomness brings and am interested in hearing more about why you chose that design.
This PR does not cover all parts of the specification, only the major parts of the XForm structure that can be used to meaningfully verify some aspects of JavaRosa performance, in this case: 1. creating the FormDef and 2. caching the FormDef. However, the synthetic XForm builder I implemented wasn't written with only our current benchmarks in mind (which is why you see other control types defined; they can be removed). Only two control types were used for these benchmarks.
I am not claiming that I am very experienced in all parts of the XForm structure, or that this PR covers every possible part of the XForm definition (the XForm spec). The implementation was rather meant to quickly analyse benchmarks for different complexities of an XForm, complexity in terms of the number of questions, groups and secondary instances generated.
The builder design pattern was employed to create the different parts of the XML strings of the XForm structure.
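As a rough illustration of that pattern (a simplified sketch only; the class and method names here are illustrative, not the exact code in this PR):

```java
// Simplified sketch of the builder idea: each build* method appends one part
// of the XForm XML and returns the builder so calls can be chained.
public class XFormBuilder {
    private final StringBuilder xml = new StringBuilder();

    public XFormBuilder buildHeader(String title) {
        xml.append("<h:html xmlns=\"http://www.w3.org/2002/xforms\" ")
           .append("xmlns:h=\"http://www.w3.org/1999/xhtml\">")
           .append("<h:head><h:title>").append(title).append("</h:title><model>");
        return this;
    }

    public XFormBuilder buildMainInstance(int noOfQuestions) {
        xml.append("<instance><data id=\"synthetic-form\">");
        for (int i = 0; i < noOfQuestions; i++) {
            xml.append("<q").append(i).append("/>");
        }
        xml.append("</data></instance>");
        return this;
    }

    public XFormBuilder buildBindsAndBody(int noOfQuestions) {
        // Binds, secondary instances and body controls are appended here in
        // the same chained style; omitted to keep the sketch short.
        return this;
    }

    public String build() {
        xml.append("</model></h:head><h:body/></h:html>");
        return xml.toString();
    }
}
```

Usage then reads as a single chain, for example `new XFormBuilder().buildHeader("Benchmark form").buildMainInstance(500).buildBindsAndBody(500).build()`.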
Some areas of the specification are not yet implemented in the builder.
The XFormBuilder build flow is split into separate build steps to make it easier to maintain.
To answer your questions directly:
"Why did you choose to generate main instances with the full range of possible controls and types?" The other control types were defined in case there is a future need for them; currently nothing is used apart from the select and the text input. The select was used specifically because the time taken to populate a question's options was taken into consideration, both when running the benchmarks and during manual testing of the forms.
"Do you expect there to be a performance difference between them?" Other than considering the time to populate the options, no. The relative performance between control types is not what was intended to be measured.
"I'm less sure about what the randomness brings and am interested in hearing more about why you chose that design." By random, you mean the options of the select control? It's used to pick any of the internal instances at random.
The snippet below defines the benchmark parameters for the possible combinations that define an XForm.
The benchmark is run once for every combination in the cross product of the parameter values.
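A simplified sketch of the kind of parameter state and benchmark method involved (field names and values here are illustrative, and the JavaRosa entry point shown is an assumption about which parsing call the benchmarks exercise):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import org.javarosa.core.model.FormDef;
import org.javarosa.xform.util.XFormUtils;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
public class ParseSyntheticXFormBenchmark {

    // JMH runs the benchmark once per combination in the cross product of
    // these values (3 x 2 = 6 combinations in this sketch).
    @Param({"10", "500", "5000"})
    public int noOfQuestions;

    @Param({"1", "50"})
    public int noOfInternalSecondaryInstances;

    private byte[] xFormXml;

    @Setup(Level.Trial)
    public void generateForm() {
        // XFormBuilder refers to the simplified sketch earlier in this thread;
        // noOfInternalSecondaryInstances would feed into the secondary-instance
        // generation in the real builder.
        xFormXml = new XFormBuilder()
                .buildHeader("Benchmark form")
                .buildMainInstance(noOfQuestions)
                .buildBindsAndBody(noOfQuestions)
                .build()
                .getBytes(StandardCharsets.UTF_8);
    }

    @Benchmark
    public FormDef benchmarkParse() throws Exception {
        // Parse the generated XForm into a FormDef on each invocation.
        return XFormUtils.getFormFromInputStream(new ByteArrayInputStream(xFormXml));
    }
}
```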
Thank you for the extra context. In general, the simpler the approach and code, the more likely it is that someone else can give meaningful feedback and eventually build on it, especially when there are no comments or context in commits. I completely understand wanting to plan ahead and it's really cool that you dove deep into generating forms. That said, it would really have helped to have as narrowly scoped a PR as possible.

We talked about this briefly last week but I want to confirm one more time: the results you are getting out of these benchmarks are correlated with the results you have gotten in profiling sessions on real Android devices running Collect, right? That is, if these benchmarks show a linear relationship between the number of elements in a secondary instance and the time taken to load from cache, it's also roughly linear in profiling? I say roughly because the benchmarking environment is much more controlled, but if somehow the shape of the curve is different, there's a problem. I can't imagine what would lead to that, but stranger things have happened.
Are you saying you actually measured a difference between having selects and other question types in the primary instance? That is surprising to me because I don't believe choices are populated from secondary instances until the question is displayed. Either way, this assumption and its implications should be documented explicitly in the code or in a commit message.

Focusing on the three benchmark classes listed above, could you please write a little about how you picked the parameter values you used? Do you feel like all combinations in the cross product add value? What relationships are you trying to identify, and would it be possible to identify them with fewer cases? The case of 10 instances with 50,000 elements each in particular seems unrealistic and time-consuming to run. Is it possible to give a CSV of input values, like with JUnit, to avoid the full cross product?
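On avoiding the full cross product: JMH itself always expands @Param values into their cross product, so one common workaround (a sketch, not something already in this PR) is to encode only the specific cases of interest in a single composite parameter and split it during setup:

```java
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
public class SecondaryInstanceShapeState {

    // Each entry is one explicit case: secondary instances x elements per instance.
    // Only these three cases run, instead of a full cross product.
    @Param({"1x500", "5x5000", "10x50000"})
    public String instanceShape;

    public int noOfSecondaryInstances;
    public int elementsPerInstance;

    @Setup(Level.Trial)
    public void splitShape() {
        String[] parts = instanceShape.split("x");
        noOfSecondaryInstances = Integer.parseInt(parts[0]);
        elementsPerInstance = Integer.parseInt(parts[1]);
    }
}
```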
build.gradle
@@ -77,15 +79,15 @@ jmh {
exclude = "(BenchmarkTemplate)"
threads = 1
fork = 1
What led to these changes? In particular, I thought you ended up with a high number of warmup iterations because the results weren't stable otherwise. Is that not the case?
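For reference, warmup and measurement counts can also be pinned per benchmark class with JMH annotations rather than only in build.gradle; a minimal sketch with placeholder values:

```java
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

// Placeholder values only; the point is that iteration counts can live next to
// the benchmarks they affect instead of (or as well as) in the Gradle config.
@Warmup(iterations = 10)
@Measurement(iterations = 20)
@Fork(1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Thread)
public class SomeFormBenchmark {

    @Benchmark
    public int baseline() {
        // Real benchmark body elided in this sketch.
        return 0;
    }
}
```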
@Param({"0"}) | ||
public int noOfQuestionGroups = 1; | ||
@Param({"0"}) | ||
public int noOfExternalSecondaryInstances = 50; |
It looks like the explicit values that the parameters are set to are ignored in favor of the injected values. Is there a reason to have those explicit values? It would be clearer to just declare the fields without initializing them.
This was set in order to debug and test it as a plain Java class without running the benchmarks.
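A minimal sketch of how the fields could stay uninitialized for JMH while still being settable for plain-Java debugging (the field names follow the snippet above; the class name and benchmark body are illustrative):

```java
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
public class GroupAndInstanceParamsBenchmark {

    // No initializers: JMH injects each @Param value, so explicit defaults
    // would be ignored during benchmark runs anyway.
    @Param({"0"})
    public int noOfQuestionGroups;

    @Param({"0"})
    public int noOfExternalSecondaryInstances;

    @Benchmark
    public long run() {
        // Real benchmark body elided in this sketch.
        return (long) noOfQuestionGroups * noOfExternalSecondaryInstances;
    }

    // Plain-Java entry point for debugging the logic without the JMH harness.
    public static void main(String[] args) {
        GroupAndInstanceParamsBenchmark state = new GroupAndInstanceParamsBenchmark();
        state.noOfQuestionGroups = 1;
        state.noOfExternalSecondaryInstances = 50;
        state.run();
    }
}
```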
Codecov Report
@@ Coverage Diff @@
## master #454 +/- ##
============================================
- Coverage 49.03% 49.02% -0.02%
+ Complexity 2940 2939 -1
============================================
Files 246 246
Lines 13698 13698
Branches 2650 2650
============================================
- Hits 6717 6715 -2
Misses 6132 6132
- Partials 849 851 +2
Continue to review full report at Codecov.
Hi @lognaturel. I initially created the draft to give an idea of how I implemented it. I have made a couple of clean-ups which should make things a bit clearer.
As for the comparison between the benchmarks and the profiling sessions: yes, they are commensurate. Do you need some data to back that up?
I had just assumed the options were used during question creation in some way. I have removed the select list control, so we now only use the text input. The parameters that were used categorise the questions and options into sizes ranging from realistic to extreme.
I don't think we have to benchmark only real-world parameters at all times. The outrageous parameters also serve as a stress test for the methods and show where efficiency begins to drop.
That's great. If you've taken a look and confirmed it, nothing more needed, I think. I'll take another pass through soon. Overall my goal is not to be too picky about this now. We can look at individual benchmarks in more detail as they're being used to evaluate possible changes.
Closes #
It's an improvement on the recent benchmarking PR, intended to run benchmarks on synthetic/dynamically generated forms instead of static forms.
What has been done to verify that this works as intended?
Benchmarks have been run on Travis CI and the results reviewed.
Why is this the best possible solution? Were any other approaches considered?
It's still a work in progress, so suggestions are most welcome.
How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?
Changes aren't part of the main source set, so no user impact is expected.
Do we need any specific form for testing your changes? If so, please attach one.
Does this change require updates to documentation? If so, please file an issue here and include the link below.