Datasets that need more aggregated columns #846

brockfanning · 2017-12-05T04:38:39Z

Looking forward to the possibility of showing disaggregations on indicators, I notice some datasets need additional aggregated columns. These datasets contain the disaggregated values (which is a great start!) but may be missing some overall sum/average/etc columns. Here is the list I came up with:

https://sdg.data.gov/2-2-2/
- includes wasting and overweight, but needs an "all" column
https://sdg.data.gov/3-7-2/
- includes ages 10-14 and 15-19, but needs an "all" column
https://sdg.data.gov/4-1-1/
- needs these columns:
  - an "all" column
  - a "reading_all" column
  - a "math_all" column
https://sdg.data.gov/4-2-1/
- includes many disaggregations already, but needs an "all" column
https://sdg.data.gov/4-3-1/
- includes many disaggregations already, but needs an "all" column
https://sdg.data.gov/4-5-1/
- includes many disaggregations already, but needs an "all" column
https://sdg.data.gov/4-6-1/
- same as 4-5-1 above
https://sdg.data.gov/4-c-1/
- includes each level of certification, but needs an "all" column
https://sdg.data.gov/5-4-1/
- includes many disaggregations already, but needs these columns:
  - an "all" column
  - columns for each age value (7 columns)
  - columns for each gender value (2 columns)
  - columns for each activity value (4 columns)
https://sdg.data.gov/5-b-1/
- includes male and female, but needs an "all" column
https://sdg.data.gov/8-5-1/
- includes many gender/age combinations, but needs these columns:
  - an "all" column (1 column)
  - columns for each age value (9 columns)
https://sdg.data.gov/8-5-2/
- includes all gender/able-bodiedness/age combinations already, but also needs these aggregated columns:
  - "all" column for the whole population (1 column)
  - columns for each age value (10 columns)
  - columns for each gender value (2 columns)
  - columns for each able-bodiedness value (2 columns)
https://sdg.data.gov/8-8-1/
- includes all fatality/gender combinations, but needs these aggregated columns:
  - "all" column (1 column)
  - columns for each fatality value (2 columns)
  - columns for each gender value (2 columns)
https://sdg.data.gov/8-a-1/
- includes columns for commitments and disbursements, but needs an "all" column
- or alternatively if that doesn't make sense, make commitments vs. disbursements a unit of measurement?
https://sdg.data.gov/9-1-2/
- freight vs. passenger will be units of measurement, but this still needs 2 general columns: "freight_vol_all" and "pass_vol_all".
https://sdg.data.gov/16-1-1/
- has good disaggregation but the disaggregated columns need to use the same units of measurement
https://sdg.data.gov/17-6-2/
- includes columns for three speeds, but needs an "all" column
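
To illustrate how the column counts in the list above arise, here is a minimal Python sketch that enumerates the aggregate columns an indicator with several disaggregation dimensions would need: one overall "all" column plus one column per value of each dimension. The function name, the dimension values, and the column naming pattern are all hypothetical, chosen only to mirror the 8-8-1 entry; they are not taken from the actual datasets.

```python
def aggregate_columns(dimensions):
    """List the aggregate columns for a set of disaggregation dimensions.

    `dimensions` maps a dimension name to its list of values.
    The result is one overall "all" column plus one column per
    value of each dimension (naming pattern is an assumption).
    """
    columns = ["all"]
    for name, values in dimensions.items():
        for value in values:
            columns.append(f"{name}_{value}")
    return columns


# Hypothetical dimension values modeled on the 8-8-1 entry.
dims_8_8_1 = {
    "fatality": ["fatal", "nonfatal"],
    "gender": ["male", "female"],
}

needed = aggregate_columns(dims_8_8_1)
print(needed)
print(len(needed))  # 1 "all" + 2 fatality + 2 gender = 5
```

The same helper applied to three dimensions (as in 8-5-2) would give the 1 + 10 + 2 + 2 columns listed for that indicator, assuming the value lists are filled in accordingly.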

@Kali2017SDG I may be off-base with the above, but I think it's worth checking into. Do you have any thoughts? If it helps, I could create each item as a separate GitHub issue, and we could loop in the data provider for that particular indicator.

brockfanning · 2017-12-06T11:27:58Z

It may be worth considering: can these aggregate columns be computed, rather than manually entered? This may depend on the indicator, but presumably any sum or average aggregates could be computed by the platform, which would save work for the data providers.

Take the first example, https://sdg.data.gov/2-2-2/, which needs an "all" column. The platform could easily add the 2 types of malnutrition (0.6% wasting and 8.1% overweight) together to get an "all" column of 8.7%.
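
A minimal Python sketch of what such a computed column could look like, assuming each data row is a simple mapping of column name to value. The function name `add_all_column`, its parameters, and the row layout are hypothetical and not part of the platform; whether a sum or a mean is conceptually meaningful would still need to be decided per indicator.

```python
def add_all_column(row, columns, how="sum", name="all"):
    """Return a copy of `row` with a computed aggregate column added.

    `columns` lists the disaggregated columns to combine, and
    `how` selects the aggregate ("sum" or "mean"). The result is
    rounded to one decimal place to match the published figures.
    """
    values = [row[c] for c in columns]
    if how == "sum":
        total = sum(values)
    elif how == "mean":
        total = sum(values) / len(values)
    else:
        raise ValueError(f"unsupported aggregate: {how}")
    result = dict(row)  # leave the original row untouched
    result[name] = round(total, 1)
    return result


# Values quoted above for indicator 2-2-2 (percentages).
row_2_2_2 = {"wasting": 0.6, "overweight": 8.1}
with_all = add_all_column(row_2_2_2, ["wasting", "overweight"])
print(with_all["all"])  # 8.7
```

Keeping the aggregate type as an explicit parameter means the choice stays visible for each indicator rather than being applied uniformly.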

JenPark9 · 2017-12-08T15:49:42Z

Thank you, Brock. I think for the most part, we will need to ask the data providers for the suitable "total" statistic. Some of the categories are not things that conceptually should be totaled--for example, adding wasting and overweight in 2.2.2. I think this is something that Kali could reach out to individual data providers about.
Let me say I really appreciate your looking at this so closely and coming up with next steps to try!

brockfanning · 2017-12-08T16:29:44Z

@JenPark9 Sounds great, thank you!

brockfanning changed the title ~~Datasets that could use more aggregated columns~~ Datasets that need more aggregated columns Dec 6, 2017
