Level up your web scraping skills: Extracting 2020 presidential election votes from news images

This repository contains the code from the article I published on Medium.

The article describes a method for retrieving the vote counts, hour by hour, from the images published in the CNN live story.

The repository includes a bash script that downloads the JSON files from the live story news site. Those JSON files are also provided in this repository; they contain links to the images processed in the article (you can also download the images from those links).
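As a rough illustration of how image links could be pulled out of such a JSON file, here is a minimal sketch. It assumes nothing about the actual schema of the live story files: it simply walks the parsed JSON and keeps every string that looks like an HTTP(S) URL ending in a common image extension. The names `find_image_urls`, `image_urls_from_file`, and `IMAGE_EXTENSIONS` are hypothetical and not taken from the repository.

```python
import json
from urllib.parse import urlparse

# Assumption: images are referenced by URLs ending in one of these extensions.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def find_image_urls(node):
    """Recursively collect strings that look like image URLs from parsed JSON."""
    urls = []
    if isinstance(node, dict):
        for value in node.values():
            urls.extend(find_image_urls(value))
    elif isinstance(node, list):
        for item in node:
            urls.extend(find_image_urls(item))
    elif isinstance(node, str):
        parsed = urlparse(node)
        # Check the URL path only, so query strings such as "?w=100" are ignored.
        if parsed.scheme in ("http", "https") and parsed.path.lower().endswith(IMAGE_EXTENSIONS):
            urls.append(node)
    return urls


def image_urls_from_file(path):
    """Load one JSON file and return the image URLs found anywhere inside it."""
    with open(path, encoding="utf-8") as handle:
        return find_image_urls(json.load(handle))
```

Because the search is schema-agnostic, it may also return images unrelated to the vote counts (thumbnails, logos), so some filtering would likely still be needed.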

Have a look at the Jupyter notebook for more insight into the Python code.
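To give an idea of the kind of processing involved, the sketch below reads text from an image with `pytesseract` and extracts comma-grouped integers from it. This is not the notebook's code: the function names are hypothetical, the regular expression assumes vote counts are printed with comma thousands separators, and no image preprocessing is applied, which real news images would probably need.

```python
import re

# Assumption: vote counts appear as comma-grouped integers such as "1,234,567".
VOTE_PATTERN = re.compile(r"(?<!\d)\d{1,3}(?:,\d{3})+(?!\d)")


def parse_vote_counts(text):
    """Return every comma-grouped integer found in the OCR text, in order."""
    return [int(match.replace(",", "")) for match in VOTE_PATTERN.findall(text)]


def read_vote_counts(image_path):
    """Run OCR on one image and return the vote-like numbers it contains."""
    # Imported here so parse_vote_counts stays usable without the OCR stack.
    from PIL import Image
    import pytesseract

    text = pytesseract.image_to_string(Image.open(image_path))
    return parse_vote_counts(text)
```

The parsing step can be tried on its own with a made-up string (illustrative values, not real results): `parse_vote_counts("A 1,234,567 50.5% B 987,654 49.5%")` returns `[1234567, 987654]`.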

Python requirements:

  • pandas
  • scikit-image
  • Pillow
  • tqdm
  • requests
  • ipyplot
  • pytesseract
  • matplotlib
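
Several of these packages are imported under a different name than the one used to install them. The following sketch checks which ones are missing from the current environment. The mapping is an assumption based on the usual import names, and it assumes the plotting helper in the list is the package published on PyPI as `ipyplot`. Note that `pytesseract` is only a wrapper: the Tesseract OCR engine itself has to be installed separately.

```python
from importlib.util import find_spec

# Install name -> import name (assumed from each package's usual conventions).
REQUIREMENTS = {
    "pandas": "pandas",
    "scikit-image": "skimage",
    "Pillow": "PIL",
    "tqdm": "tqdm",
    "requests": "requests",
    "ipyplot": "ipyplot",
    "pytesseract": "pytesseract",
    "matplotlib": "matplotlib",
}


def missing_packages(requirements=REQUIREMENTS):
    """Return the install names whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module_name in requirements.items()
        if find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```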