-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correct calculation of size of calculated_stats_
vector
#466
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced by all the hardcoded numbers, but if that's the best way to do it then it's ok.
Other than that, it all seems good.
src/HealthGPS/analysis_module.cpp
Outdated
// The number of factors to calculate stats for is the number of factors minus the length of the | ||
// `factors` vector. | ||
size_t num_stats_to_calc = context.mapping().entries().size() - factors_to_calculate_.size(); | ||
|
||
// But we calculate the mean and standard deviation of each factor, so we need to multiply by 2 | ||
num_stats_to_calc *= 2; | ||
|
||
// We also want to calculate the prevalence and incidence of each disease | ||
num_stats_to_calc += 2 * context.diseases().size(); | ||
|
||
// We also want to keep the count, deaths, and emigrations | ||
num_stats_to_calc += 3; | ||
|
||
// We also want to calculate the normal, overweight, obese, and above weight prevalence | ||
num_stats_to_calc += 4; | ||
|
||
// Finally, we want to calculate the mean and standard deviation of YLL, YLD, and DALY | ||
num_stats_to_calc += 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like there are a lot of hardcoded numbers here... Is there a way we could pass them in as various arguments so we don't have to remember to manually change them if we add/remove factors etc.?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah you're right it's a bit horrible isn't it. I'll have a think about a nicer way to do this. Eventually the other "channels" stuff will be removed and with it the reference to what the calculated stats (column headings) actually are, so it might be good to have a human-readable list somewhere perhaps stored as a std::vector<std::string>
as a data member in the class or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
So I did some bits, but then realised that all I was doing was simple duplicating what the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 😄
num_stats_to_calc += 6; | ||
// And for each factor, we calculate the stats described in `channels_`, so we | ||
// multiply the size of `channels_` by the number of factors to calculate stats for. | ||
num_stats_to_calc *= channels_.size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much cleaner!
In #453 I introduced the
calculated_factors_
vector to store the calculated stats for each bin for each factor the user wants to calculate stats for. The size of this vector is calculated and set ininitialise_vector
method, but initially I only allowed one "channel" per factor. This PR fixes this by including the other "channels" in the vector resize calculation, namely those specified in the already-existinginitialise_output_channels
method:healthgps/src/HealthGPS/analysis_module.cpp
Lines 530 to 559 in 46677a6
I have also renamed the vector to
calculated_stats_
as it seems more appropriate.