CKAN Downloader for Amsterdam Data

This is a simple downloader for all data hosted through the Amsterdam city data portal (http://data.amsterdam.nl). Since this is a CKAN instance, we will use the CKAN API to retrieve the JSON description of the data, and use the url of the JSON object to download the data (if possible)

import json
import requests

Do an HTTP GET against the API to retrieve all datasets (limited to 10000, which is way more than the CKAN contains).
Take the JSON representation of the response and convert it to a Python dictionary.
Take the results element of the JSON object (a Python dictionary)

ckan_response = requests.get('http://data.amsterdam.nl/api/search/dataset?all_fields=1&offset=0&limit=10000')
ckan_json = ckan_response.json()
results = ckan_json['results']

Loop over all results
For every result, check whether it is in a format that we can understand
If so, retrieve it by doing a GET against the url of the resource
... and save it to the current directory.

for r in results:
    rj = json.loads(r['data_dict'])
    
    for resource in rj['resources']:
        if resource['format'] in ['JSON','api','XLS','CSV','ZIP'] :
            print u"Retrieving {}".format(resource['name'])
            resource_data_response = requests.get(resource['url'])
            resource_filename = u"{}-{}.{}".format(resource['id'],resource['name'].replace(' ','_'),resource['format'].lower())
            
            try :
                with open(resource_filename,'wb') as resource_file:
                    print u"Writing to {}".format(resource_filename)
                    resource_file.write(resource_data_response.content)
            except:
                print u"Error while writing {}".format(resource_filename)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

CKAN Downloader for Amsterdam Data

Files

README.md

Latest commit

History

README.md

File metadata and controls

CKAN Downloader for Amsterdam Data