Sample code from the work I produced as an intern at WRI in 2019 including web scraping and predictive modeling. The full codebase will be available open-source soon.

aquevedo93/CO2_Emissions_PP

CO2 Emissions at the Power Plant Level

Throughout the summer of 2019, I was a Data Science Intern in the Climate Program at the World Resources Institute (WRI), supporting Power Explorer, a project that seeks to provide open access to global data on power production and its impacts. This repository contains sample code from the work I produced. The full codebase will be released as open source in the near future.

Web Scraping

Python scripts that use web scraping packages such as bs4 and re to automate data extraction, transformation, and integration into WRI's Global Power Plant Database. In total, the scripts scraped over 40,000 data points on CO2 emissions at the power plant level from multiple countries.
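The extract-and-clean step can be illustrated with a minimal sketch. It is not the code from the notebooks: the HTML snippet, the column layout, and the names `TableParser`, `parse_number`, and `extract_emissions` are all hypothetical, and the standard library's `html.parser` stands in for bs4 so the example runs without extra installs. The `re` part shows the general idea of pulling a numeric value out of a messy table cell.

```python
import re
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Matches the first number in a string, allowing thousands separators.
NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def parse_number(text):
    """Return the first number in `text` as a float, or None if absent."""
    match = NUMBER.search(text)
    if match is None:
        return None
    return float(match.group().replace(",", ""))


def extract_emissions(html):
    """Turn a two-column (plant, emissions) table into a list of records."""
    parser = TableParser()
    parser.feed(html)
    records = []
    for row in parser.rows[1:]:  # skip the header row
        if len(row) != 2:
            continue
        name, raw_value = row
        value = parse_number(raw_value)
        if value is not None:  # drop rows with no usable number
            records.append({"plant": name, "co2_tonnes": value})
    return records


# Hypothetical input; real sources differ in layout and units.
SAMPLE_HTML = """
<table>
  <tr><th>Plant</th><th>CO2 (t)</th></tr>
  <tr><td>Plant A</td><td>1,234,567 t</td></tr>
  <tr><td>Plant B</td><td>n/a</td></tr>
  <tr><td>Plant C</td><td>89,000.5</td></tr>
</table>
"""

records = extract_emissions(SAMPLE_HTML)
```

Rows without a parseable number (here, "Plant B") are dropped rather than filled in, which keeps missing values out of the integrated dataset.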

Sample Jupyter Notebooks included:

  • Australia (australia_dataset_parsing.ipynb)
  • European Union (JRC_Power_Plants.ipynb)

Countries scraped but not included in sample:

  • United States
  • Canada
  • India

Machine Learning Models

co2_prediction_models.ipynb: Machine learning algorithms that predict CO2 emissions for all thermal power plants worldwide using the data previously extracted. Using a min-max scaled dataset, the work achieved a coefficient of determination (R²) of 0.975 and a mean squared error of 0.000171 on the test set.
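To help interpret these metrics, here is a small sketch of how min-max scaling interacts with the coefficient of determination and the mean squared error. The numbers and function names below are made up for illustration and are not taken from the notebook. The point is that R² is unchanged by min-max scaling of the target, while the mean squared error shrinks by the square of the target's range, so an MSE reported on scaled data is in scaled units.

```python
def min_max_scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    if hi == lo:
        raise ValueError("cannot scale a constant range")
    return [(v - lo) / (hi - lo) for v in values]


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between truth and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up emissions and predictions, in arbitrary units.
y_true = [100.0, 200.0, 300.0, 400.0, 500.0]
y_pred = [110.0, 190.0, 310.0, 390.0, 500.0]

# Scale both series with the range of the true values.
lo, hi = min(y_true), max(y_true)
y_true_scaled = min_max_scale(y_true, lo, hi)
y_pred_scaled = min_max_scale(y_pred, lo, hi)

mse_raw = mean_squared_error(y_true, y_pred)
mse_scaled = mean_squared_error(y_true_scaled, y_pred_scaled)
r2_raw = r_squared(y_true, y_pred)
r2_scaled = r_squared(y_true_scaled, y_pred_scaled)

print(f"MSE raw={mse_raw:.4f} scaled={mse_scaled:.6f}")
print(f"R2  raw={r2_raw:.4f} scaled={r2_scaled:.4f}")
```

Because of this, a scaled MSE is best compared only against other results computed on the same scaled target, whereas R² can be compared regardless of scaling.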
