Scraper candidate: City of Chicago Landmarks database #8
Comments
The datasets are different for some reason. We should figure out which researchers at DHED know about these.
Juan-Pablo Velez
On Friday, January 11, 2013 at 4:43 PM, Derek Eder wrote:
Yes, these are different datasets.

The "historical landmarks" first referenced above are the same items as in the dataset at the bottom of the issue above: https://data.cityofchicago.org/Historic-Preservation/Individual-Landmarks-Shapefiles/2h6e-2yk6

These are "a list of individual Chicago Landmarks designated by City Council upon recommendation of the Commission on Chicago Landmarks". In other words, they went through a formal designation process and made the final cut. The data from this process is the scrapable PDFs of monthly meeting minutes published by the Commission, five years of which are published here. These are a good candidate for scraping: well-formed addresses, each with large blocks of descriptive narrative. This is very rich information that can inform decision-making in the future. I will add that in a separate issue.

Though there may be internal documents of the Commission that are more structured than these meeting minutes, it's not likely that the City would ever go back and attempt to turn these PDFs into publishable datasets on the data portal. There are far more worthwhile dataset candidates than this one.

However, turning these meeting minutes into structured data might be a good project for a non-programmer to get involved in edifice. A tool like http://tabula.nerdpower.org/ wouldn't really work, because the content is not tabular to begin with. At EveryBlock, we had a custom tool for doing this (pull in text, highlight proposed addresses and the blocks of text associated with them, allow a human to confirm or fix, and publish). See screenshot. It seems like it would be a good thing to do that in this project. Anyone want to make that?

The middle item referenced above covers all items from the "inventory of architecturally and historically significant structures". This is a completely separate dataset, and super-useful to this project. Added that as #26 (could someone with access please add the "scraper" label to that issue?)
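For anyone who wants to pick up the confirm-and-fix tool idea, here is a rough sketch of the first step only: pulling candidate addresses and their surrounding text out of already-extracted minutes text, so a human can review them. This is not the EveryBlock tool. The regex, the `find_candidates` name, the suffix list, and the sample addresses are all assumptions made up for illustration, and real minutes will need a looser pattern.

```python
import re

# Assumed pattern: house number (or range), direction letter, one to three
# capitalized name words, then a common street suffix. Illustrative only.
ADDRESS_RE = re.compile(
    r"\b\d{1,5}(?:-\d{1,5})?\s+[NSEW]\.?\s+"
    r"(?:[A-Z][A-Za-z']+\s+){1,3}?"
    r"(?:Ave|Avenue|St|Street|Blvd|Boulevard|Dr|Drive|Rd|Road|Pl|Place|Ct|Court|Pkwy|Parkway)\b"
)


def find_candidates(text, context=80):
    """Return candidate addresses with nearby text for human review."""
    candidates = []
    for match in ADDRESS_RE.finditer(text):
        start, end = match.span()
        candidates.append({
            "address": match.group(0),
            "start": start,
            "end": end,
            # Surrounding text the reviewer can confirm or fix.
            "snippet": text[max(0, start - context):min(len(text), end + context)],
            "confirmed": False,  # flipped by a human, never by the script
        })
    return candidates


# Made-up sample text, not taken from real Commission minutes.
sample = (
    "The Commission reviewed 1234 N. Example Ave. and found it eligible. "
    "Staff also presented 56-58 W. Sample Street for preliminary designation."
)
candidates = find_candidates(sample)
```

Each candidate keeps its character offsets, so a review interface could highlight the match in place and let the reviewer widen or shrink the associated block of narrative before publishing.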
The City of Chicago has an online tool for looking up historical landmarks. These should be pretty easy to scrape.
A couple hundred historical landmarks with descriptions and images:
http://webapps.cityofchicago.org/landmarksweb/web/listings.htm
A database of 17,000 Chicago buildings including address, architect, type, color code, major tenant (probably outdated), and PIN.
Selecting a blank value for Architect should return the whole list (I think).
http://webapps.cityofchicago.org/landmarksweb/search/home.htm
This may have already been released as one of these datasets:
https://data.cityofchicago.org/Historic-Preservation/Individual-Landmarks-Shapefiles/2h6e-2yk6
https://data.cityofchicago.org/Historic-Preservation/Individual-Landmarks/tdab-kixi
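To check whether the portal data already covers what the web tool shows, the rows could be pulled down and compared before writing a scraper. A minimal sketch, assuming the portal exposes the `tdab-kixi` dataset through the usual Socrata `/resource/<id>.json` endpoint; the helper names are hypothetical and no column names are assumed here.

```python
import json
import urllib.parse
import urllib.request

PORTAL = "https://data.cityofchicago.org"


def resource_url(dataset_id, limit=1000):
    """Build the JSON endpoint URL for a portal dataset id (assumed layout)."""
    query = urllib.parse.urlencode({"$limit": limit})
    return f"{PORTAL}/resource/{dataset_id}.json?{query}"


def fetch_rows(dataset_id, limit=1000):
    """Download up to `limit` rows as a list of dicts. Needs network access."""
    with urllib.request.urlopen(resource_url(dataset_id, limit), timeout=30) as resp:
        return json.load(resp)


# Example (not run here): rows = fetch_rows("tdab-kixi")
```

Comparing the row count and addresses against the listings page would show fairly quickly whether the two sources are the same list.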