Datasets that need more aggregated columns #846

Open
brockfanning opened this issue Dec 5, 2017 · 3 comments

brockfanning commented Dec 5, 2017

Looking forward to the possibility of showing disaggregations on indicators, I notice some datasets need additional aggregated columns. These datasets contain the disaggregated values (which is a great start!) but may be missing some overall aggregate columns (sum, average, etc.). Here is the list I came up with:

  1. https://sdg.data.gov/2-2-2/
    • includes wasting and overweight, but needs an "all" column
  2. https://sdg.data.gov/3-7-2/
    • includes ages 10-14 and 15-19, but needs an "all" column
  3. https://sdg.data.gov/4-1-1/
    • needs these columns:
      • an "all" column
      • a "reading_all" column
      • a "math_all" column.
  4. https://sdg.data.gov/4-2-1/
    • includes many disaggregations already, but needs an "all" column
  5. https://sdg.data.gov/4-3-1/
    • includes many disaggregations already, but needs an "all" column
  6. https://sdg.data.gov/4-5-1/
    • includes many disaggregations already, but needs an "all" column
  7. https://sdg.data.gov/4-6-1/
    • same as 4-5-1 above
  8. https://sdg.data.gov/4-c-1/
    • includes each level of certification, but needs an "all" column
  9. https://sdg.data.gov/5-4-1/
    • includes many disaggregations already, but needs these columns:
      • an "all" column
      • columns for each age value (7 columns)
      • columns for each gender value (2 columns)
      • columns for each activity value (4 columns)
  10. https://sdg.data.gov/5-b-1/
    • includes male and female, but needs an "all" column
  11. https://sdg.data.gov/8-5-1/
    • includes many gender/age combinations, but needs these columns:
      • an "all" column (1 column)
      • columns for each age value (9 columns)
  12. https://sdg.data.gov/8-5-2/
    • includes all gender/able-bodiedness/age combinations already, but also needs these aggregated columns:
      • "all" column for the whole population (1 column)
      • columns for each age value (10 columns)
      • columns for each gender value (2 columns)
      • columns for each able-bodiedness value (2 columns)
  13. https://sdg.data.gov/8-8-1/
    • includes all fatality/gender combinations, but needs these aggregated columns:
      • "all" column (1 column)
      • columns for each fatality value (2 columns)
      • columns for each gender value (2 columns)
  14. https://sdg.data.gov/8-a-1/
    • includes columns for commitments and disbursements, but needs an "all" column
    • or alternatively if that doesn't make sense, make commitments vs. disbursements a unit of measurement?
  15. https://sdg.data.gov/9-1-2/
    • freight vs. passenger will be units of measurement, but this still needs 2 general columns: "freight_vol_all" and "pass_vol_all".
  16. https://sdg.data.gov/16-1-1/
    • has good disaggregation but the disaggregated columns need to use the same units of measurement
  17. https://sdg.data.gov/17-6-2/
    • includes columns for three speeds, but needs an "all" column

@Kali2017SDG I may be off-base with the above, but I think it's worth checking into. Do you have any thoughts? If it helps, I could create each item as a separate GitHub issue, and we could loop in the data provider for that particular indicator.
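
As a rough illustration, the checklist above could be turned into an automated header check. This is only a sketch: the `REQUIRED` mapping and the `missing_columns` helper are hypothetical names, it covers just three of the listed indicators, and the sample CSV uses placeholder values rather than real indicator data. The column names "all", "reading_all", "math_all", "freight_vol_all" and "pass_vol_all" are the ones proposed in the list.

```python
import csv
import io

# Hypothetical mapping of indicator ID -> aggregate columns requested above.
# Only a few indicators are shown; the rest would follow the same pattern.
REQUIRED = {
    "2-2-2": ["all"],
    "4-1-1": ["all", "reading_all", "math_all"],
    "9-1-2": ["freight_vol_all", "pass_vol_all"],
}


def missing_columns(csv_text, required):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [col for col in required if col not in present]


# Placeholder data, not real indicator values.
sample = "year,reading_all,math_all\n2015,1,2\n"
print(missing_columns(sample, REQUIRED["4-1-1"]))
```

A check like this only reports which aggregate columns are absent; it says nothing about whether a given aggregate is meaningful for that indicator.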

@brockfanning changed the title from "Datasets that could use more aggregated columns" to "Datasets that need more aggregated columns" on Dec 6, 2017
@brockfanning
Contributor Author

It may be worth considering: can these aggregate columns be computed, rather than manually entered? This may depend on the indicator, but presumably any sum or average aggregates could be computed by the platform, which would save work for the data providers.

To take an example, the first one: https://sdg.data.gov/2-2-2/ which needs an "all" column. The platform could easily add the 2 types of malnutrition (0.6% wasting and 8.1% overweight) together to get an "all" column of 8.7%.
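
A minimal sketch of what such a computed column might look like, assuming the rows are already loaded as dictionaries. The function name `add_all_column` and the column names "wasting" and "overweight" are assumptions made for illustration, not the platform's actual code or schema. Summing is only appropriate where the categories are genuinely additive, so a real implementation would need a per-indicator decision.

```python
def add_all_column(rows, parts, name="all", ndigits=1):
    """Return copies of rows with an extra column holding the sum of `parts`.

    If any part is missing from a row, the aggregate is left as None
    rather than computed from incomplete data.
    """
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        values = [row.get(p) for p in parts]
        if any(v is None for v in values):
            new_row[name] = None
        else:
            # Round to avoid floating-point noise in the displayed value.
            new_row[name] = round(sum(values), ndigits)
        out.append(new_row)
    return out


# The two figures quoted above for 2-2-2; column names are hypothetical.
rows = [{"wasting": 0.6, "overweight": 8.1}]
result = add_all_column(rows, ["wasting", "overweight"])
print(result[0]["all"])  # 8.7
```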

JenPark9 commented Dec 8, 2017

Thank you, Brock. I think for the most part, we will need to ask the data providers for the suitable "total" statistic. Some of the categories are not things that conceptually should be totaled; for example, adding wasting and overweight in 2.2.2 would not give a meaningful figure. I think this is something that Kali could reach out to individual data providers about.
Let me say I really appreciate your looking at this so closely and coming up with next steps to try!

@brockfanning
Contributor Author

@JenPark9 Sounds great, thank you!
