Advanced Census data with Python

A workshop/tutorial developed for NICAR 2020.

In this workshop, we'll work through the multi-step process of aggregating data from the US Census American Community Survey (ACS) to locally important geographies: specifically, Chicago community areas.
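
As a rough illustration of the aggregation step, here is a minimal sketch in plain Python. It is not taken from the notebook. It assumes you already have tract-level estimates with margins of error, plus a lookup from tract to community area, and that each tract falls entirely inside one area. The GEOIDs, values, area names and the `aggregate_to_areas` helper are all made up for illustration. Margins of error are combined with the root-sum-of-squares approximation that the Census Bureau documents for sums of ACS estimates.

```python
import math
from collections import defaultdict

# Hypothetical tract-level rows: (tract GEOID, estimate, margin of error).
tract_rows = [
    ("17031000001", 4000, 300),
    ("17031000002", 6000, 400),
    ("17031000003", 2500, 250),
]

# Hypothetical crosswalk from tract GEOID to a community area label.
tract_to_area = {
    "17031000001": "Area A",
    "17031000002": "Area A",
    "17031000003": "Area B",
}


def aggregate_to_areas(rows, crosswalk):
    """Sum estimates per area and combine MOEs as sqrt(sum of squared MOEs)."""
    totals = defaultdict(float)
    moe_squares = defaultdict(float)
    for geoid, estimate, moe in rows:
        area = crosswalk[geoid]
        totals[area] += estimate
        moe_squares[area] += moe ** 2
    return {
        area: (totals[area], math.sqrt(moe_squares[area]))
        for area in totals
    }


area_totals = aggregate_to_areas(tract_rows, tract_to_area)
for area, (estimate, moe) in sorted(area_totals.items()):
    print(f"{area}: {estimate:.0f} +/- {moe:.0f}")
```

The approximation gets less reliable as more estimates are combined, and it only applies to sums. Derived values such as medians or ratios need different handling.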

The workshop is designed as a Jupyter Notebook to support self-paced learning and future reference. To begin, you'll need to set up a Python environment using either Poetry or pip install -r requirements.txt. If you really don't know what any of that means, you may not be ready for this workshop, because it's more about showing how to do things without a lot of explaining.

NOTE: Some of the requirements may also depend on non-Python software installations. If you can help, feel free to add notes to this README and make a pull request. I've looked a bit into making this run on MyBinder.org but haven't had time to make that fully functional.

Python Census libraries you can use:

In preparing this, I gathered a list of Python libraries designed to help with Census data. Not all of them are covered in this repository, but I figured I'd keep the list anyway, in case it's useful. Feel free to send pull requests if I missed any!

Getting data

  • census: A simple wrapper for the United States Census Bureau's API. Provides access to ACS, SF1, and SF3 data sets.
  • census-area: This Python library extends the census API wrapper above to allow querying Census tracts, block groups, and blocks by Census place, as well as by arbitrary geographies. Extremely slow because of an N+1 query problem.
  • cenpy:
  • CensusData: A Python wrapper to the Census API
  • census-data-downloader: From the LA Times Datadesk, designed mostly as a command line tool to download data sets (but can be used as a Python library)
  • autocensus: Python package for collecting American Community Survey (ACS) data from the Census API, along with associated geospatial points and boundaries, in a pandas dataframe. Uses asyncio/aiohttp to request data concurrently. From the maintainers, "This package is under active development and breaking changes to its API are expected."
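
All of the libraries above wrap the same Census Data API, so it can help to see roughly what a raw request looks like. The sketch below uses only the standard library and makes no network call. The year (2018), the variable `B01003_001E` (total population), and the `build_acs5_url` and `rows_to_dicts` helpers are choices made for this illustration, not something any of the listed libraries expose. State `17` and county `031` are the FIPS codes for Illinois and Cook County. The sample response is invented, but it follows the API's shape: a JSON array of arrays with the header in the first row and every value as a string.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.census.gov/data"


def build_acs5_url(year, variables, state_fips, county_fips, api_key=None):
    """Build an ACS 5-year request for every tract in one county."""
    params = [
        ("get", ",".join(variables)),
        ("for", "tract:*"),
        ("in", f"state:{state_fips}"),
        ("in", f"county:{county_fips}"),
    ]
    if api_key:
        params.append(("key", api_key))
    # Keep ':', '*' and ',' unescaped so the URL stays readable.
    return f"{BASE_URL}/{year}/acs/acs5?" + urlencode(params, safe=":*,")


def rows_to_dicts(rows):
    """Turn the API's header-plus-rows response into a list of dicts."""
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


url = build_acs5_url(2018, ["NAME", "B01003_001E"], "17", "031")
print(url)

# Invented sample response, shaped like what the API returns.
sample_response = [
    ["NAME", "B01003_001E", "state", "county", "tract"],
    ["Example tract, Cook County, Illinois", "4000", "17", "031", "000001"],
]
records = rows_to_dicts(sample_response)
print(records[0]["B01003_001E"])
```

In practice you would fetch `url` with your HTTP client of choice and pass the decoded JSON to `rows_to_dicts`, or let one of the wrappers above handle it for you.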

Working with data

  • census-data-aggregator: Another LAT Datadesk joint, "Combine U.S. census data responsibly"
  • census-error-analyzer: Yet another from LAT; this one helps test whether values are statistically different when taking the margin of error into account.
  • censusgeocode: wraps the Census geocoder API. Can geocode single addresses or in batch, both in Python code and as a command line tool.
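
To give a feel for the kind of check census-error-analyzer is for, here is a minimal plain-Python sketch of the significance test the Census Bureau describes for comparing two ACS estimates. It does not use that library's API; the `is_statistically_different` helper and the example numbers are made up for illustration. It relies on ACS margins of error being published at the 90% confidence level, so each standard error is the margin of error divided by 1.645.

```python
import math

Z_90 = 1.645  # z-score for the 90% confidence level used by published ACS MOEs


def is_statistically_different(est_a, moe_a, est_b, moe_b, z_crit=Z_90):
    """Return True if two ACS estimates differ at the given confidence level."""
    # Convert each published margin of error to a standard error.
    se_a = moe_a / Z_90
    se_b = moe_b / Z_90
    # Test statistic: difference divided by the standard error of the difference.
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(z) > z_crit


# A gap of 500 is within the noise for these margins of error...
close_call = is_statistically_different(10000, 500, 9500, 400)
# ...while a gap of 2000 is not.
clear_gap = is_statistically_different(10000, 500, 8000, 400)
print(close_call, clear_gap)
```

The sketch assumes at least one margin of error is non-zero; with both at zero the division fails, and that case needs separate handling.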