Improve code utilization functionality #149
Comments
My two cents is that the normalized version - i.e. the one divided by the total number of platforms - is probably more informative at a glance, given that its range is |
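To make the comparison concrete, here is a minimal sketch of the two variants. It assumes the un-normalized value is the average number of platforms using each line, and that the normalized value is that average divided by the total number of platforms. The function names and the input mapping are hypothetical illustrations, not the project's actual API.

```python
from fractions import Fraction


def raw_utilization_sketch(lines_by_platform_count):
    """Average number of platforms per line (assumed definition).

    `lines_by_platform_count` maps "number of platforms using a line"
    to "number of lines used by that many platforms" (hypothetical input).
    """
    total_lines = sum(lines_by_platform_count.values())
    if total_lines == 0:
        raise ValueError("no lines to measure")
    return sum(
        Fraction(n_lines, total_lines) * n_platforms
        for n_platforms, n_lines in lines_by_platform_count.items()
    )


def normalized_utilization_sketch(lines_by_platform_count, total_platforms):
    """Raw value divided by the total number of platforms (assumed definition)."""
    return raw_utilization_sketch(lines_by_platform_count) / total_platforms


# Hypothetical data: 3 lines used by 2 platforms, 2 lines used by 1 platform.
example = {2: 3, 1: 2}
raw = raw_utilization_sketch(example)
normalized = normalized_utilization_sketch(example, total_platforms=2)
print(float(raw), float(normalized))
```

Under these assumptions the raw value grows with the number of platforms, while the normalized value stays between 0 and 1, which makes it easier to compare across code bases.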
This was the way that I was leaning, as well, so I'm happy to align everything around this. But it does give rise to a second question, which is whether we should include "unused" code in our measure of code utilization, or not. This is probably best demonstrated with a small example, see below. Consider this toy example, compiled with
The way we count SLOC, we have:
i.e., 3 lines used by 2 platforms, 2 lines used by 1 platform, and 2 lines used by 0 platforms.

There are a few ways to derive what we're currently calling Code Utilization, but for purposes of exposition I'm going to write it as a sum over "Fraction of Code" x "Fraction of Platforms".

If we compute Code Utilization including the "unused" lines, we have: (3/7 x 2/2) + (2/7 x 1/2) + (2/7 x 0/2) = 0.57. Note that the last term will always be zero (because the number of platforms using unused code is always 0), but that the presence of the "unused" lines affects the denominator in the earlier terms.

If we compute Code Utilization excluding the "unused" lines, we have: (3/5 x 2/2) + (2/5 x 1/2) = 0.8.

The main difference is that the inclusive version will be 1.0 iff every line of code is used by every platform (which implies there is no unused code), whereas the exclusive version will be 1.0 whenever all platforms use the same code, regardless of how much code is unused.

My current inclination is to exclude the unused lines, but provide an additional measure of the "unused" code in the output. So you would see something similar to the below (where I'm deliberately not using the word "utilization", for reasons that will become apparent):
(Aside: This suggests our metric should be called something like "Code Sharing" instead of "Code Utilization"). Compare that with the current output:
I feel like the first table is more intuitive, and it's more obvious that there are really two things to try and maximize. With your proposed What do you think? |
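The arithmetic in the toy example above can be checked with a short sketch. It uses the "Fraction of Code" x "Fraction of Platforms" formulation from this comment. The function names and the input mapping are hypothetical and are not taken from the project's code.

```python
from fractions import Fraction


def sharing_sketch(lines_by_platform_count, total_platforms, include_unused):
    """Sum over "Fraction of Code" x "Fraction of Platforms".

    `lines_by_platform_count` maps "number of platforms using a line" to
    "number of lines" (hypothetical input). Lines used by 0 platforms are
    dropped from the denominator when `include_unused` is False.
    """
    counts = {
        n_platforms: n_lines
        for n_platforms, n_lines in lines_by_platform_count.items()
        if include_unused or n_platforms > 0
    }
    total_lines = sum(counts.values())
    if total_lines == 0:
        raise ValueError("no lines to measure")
    return sum(
        Fraction(n_lines, total_lines) * Fraction(n_platforms, total_platforms)
        for n_platforms, n_lines in counts.items()
    )


def unused_fraction_sketch(lines_by_platform_count):
    """Fraction of all lines that no platform uses."""
    total_lines = sum(lines_by_platform_count.values())
    return Fraction(lines_by_platform_count.get(0, 0), total_lines)


# Toy example: 3 lines used by 2 platforms, 2 by 1 platform, 2 by 0 platforms.
toy = {2: 3, 1: 2, 0: 2}
inclusive = sharing_sketch(toy, total_platforms=2, include_unused=True)
exclusive = sharing_sketch(toy, total_platforms=2, include_unused=False)
unused = unused_fraction_sketch(toy)
print(f"inclusive={float(inclusive):.2f} exclusive={float(exclusive):.2f} "
      f"unused={float(unused):.2f}")
```

Reporting the exclusive value alongside the unused fraction keeps the two quantities separate: one describes how much of the used code is shared, the other how much code is not used at all.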
Feature/behavior summary
We need to add documentation for the new "code utilization" functionality, and decide which functionality to expose.
We currently provide two functions:
- `code_utilization`; and
- `normalized_utilization`

...but the tree interface displays the result of `normalized_utilization` under the heading "Code Utilization".

It is unclear whether discussion of "Code Utilization" in the documentation should describe the value computed by `code_utilization` or `normalized_utilization`, and having two variants is likely to confuse users.

Request attributes
Related issues
No response
Solution description
Decide on what we want code utilization to mean, and document it accordingly.
Additional notes
No response