Pull meta descriptions for each dataset #24

derekeder · 2013-02-13T06:13:57Z

The City does a pretty good job of documenting what each of these datasets is. Can these descriptions be pulled from the API?

Example: https://data.cityofchicago.org/Buildings/Building-Footprints/w2v3-isjw

Also, they usually link to a document that defines all the data fields and what they mean. Can we somehow either link to or read this in?

Example: https://data.cityofchicago.org/api/assets/003C600C-3A66-4605-8E7E-2477AAE95E16
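On the first question, here is a rough sketch of how the description might be fetched. It assumes the portal exposes a Socrata-style `/api/views/<id>.json` metadata endpoint and that the response carries a `description` key; neither is confirmed in this thread. The function names are made up for illustration.

```python
import json
from urllib.parse import urlparse
from urllib.request import urlopen


def dataset_id_from_url(page_url: str) -> str:
    """Return the last path segment of a dataset page URL (e.g. 'w2v3-isjw')."""
    path = urlparse(page_url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


def metadata_url(page_url: str) -> str:
    """Build the assumed metadata endpoint for a dataset page URL."""
    parts = urlparse(page_url)
    dataset_id = dataset_id_from_url(page_url)
    # Assumption: the portal serves view metadata at /api/views/<id>.json
    return f"{parts.scheme}://{parts.netloc}/api/views/{dataset_id}.json"


def fetch_description(page_url: str):
    """Download the metadata and return its 'description' field, if present."""
    with urlopen(metadata_url(page_url)) as resp:
        meta = json.load(resp)
    # Assumption: the description lives under a top-level 'description' key
    return meta.get("description")
```

Calling `fetch_description("https://data.cityofchicago.org/Buildings/Building-Footprints/w2v3-isjw")` would hit the network, so it is worth checking the endpoint by hand first.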

mccc · 2013-02-13T06:23:35Z

It would be cool if we could automatically pull those PDFs, but they seem to usually be linked in free-form 'Description' text fields. It might just involve some super-simple scanning for bit.ly links, if that's the city's standard practice over hundreds of datasets.
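Something like this is what I mean by super-simple scanning: a minimal sketch that pulls bit.ly links out of a description string. The pattern is a guess at what the short links look like, and the sample links in the usage note are hypothetical, not real city links.

```python
import re

# Matches bit.ly short links with or without a scheme.
# Assumption: short codes use only letters, digits, '_' and '-'.
BITLY_RE = re.compile(r"(?:https?://)?bit\.ly/[A-Za-z0-9_-]+")


def find_bitly_links(description: str) -> list[str]:
    """Return every bit.ly link found in a free-form description, in order."""
    return BITLY_RE.findall(description)
```

For example, `find_bitly_links("Data dictionary: http://bit.ly/abc123. See also bit.ly/XyZ_9")` returns both links and leaves the sentence-ending period out of the first one.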

Re: scraping that document — if only there was a schema for the data dictionary for my schema... [I think this is what Heidegger called "the hermeneutic circle"]

danxoneil · 2013-02-15T05:48:39Z

Unfortunately, a scan for bit.ly links would not be comprehensive. Here's an example: https://data.cityofchicago.org/Health-Human-Services/Public-Health-Statistics-Gonorrhea-cases-for-femal/cgjw-mn43?.

This is a dataset currently included in this project.

Perhaps just follow and slurp all of the links in each metadata field, then associate them with that dataset, and sort it out by hand later?
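A minimal sketch of the collect-everything approach, assuming each dataset's metadata is already in hand as a plain dict of field names to values. The field names and URLs in the usage note are placeholders, and the URL pattern is deliberately loose since the results get sorted out by hand anyway.

```python
import re

# Loose URL pattern: anything starting with http(s):// up to whitespace,
# an angle bracket, or a double quote.
URL_RE = re.compile(r'https?://[^\s<>"]+')


def links_by_field(metadata: dict) -> dict[str, list[str]]:
    """Map each string-valued metadata field to the URLs found in its text."""
    found = {}
    for field, value in metadata.items():
        if not isinstance(value, str):
            continue  # skip counts, nested objects, and other non-text fields
        # Trim sentence punctuation that tends to trail a pasted link.
        urls = [u.rstrip(".,;:?!") for u in URL_RE.findall(value)]
        if urls:
            found[field] = urls
    return found


def links_by_dataset(datasets: dict) -> dict[str, dict[str, list[str]]]:
    """Associate every discovered link with its dataset id for later review."""
    return {ds_id: links_by_field(meta) for ds_id, meta in datasets.items()}
```

Given `{"abcd-1234": {"description": "See https://example.org/dict.pdf."}}`, `links_by_dataset` keeps the link under that dataset id and field name, which should be enough to review the list manually before fetching anything.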
