My project for Big Data Systems, in which I proposed to implement preprocessing, data analysis, request handling, and website integration for the topic of the relationship between education and air pollution.
The conclusion I reached from this project is that air pollution is definitely correlated with education; however, the bigger issue at hand is the overall development of a country, not strictly education. Stating that air quality is bad because people are not educated means pointing fingers at the general population, who are, generally speaking, a product of the country's overall state of development.
See the images in the web/images folder for the correlation values between education metrics and air pollution.
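
For readers unfamiliar with what a correlation value measures, here is a minimal sketch of a Pearson correlation coefficient in plain Python. It is not the code used in this project (the analysis here relied on the libraries listed under tools), and the function name `pearson_r`, the variable names, and the sample numbers are all hypothetical placeholders, not project data or results.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up placeholder values for illustration only (NOT project data):
# a hypothetical education metric and a hypothetical pollution metric per country.
education_metric = [55.0, 68.0, 74.0, 89.0, 97.0]
pollution_metric = [62.0, 48.0, 41.0, 22.0, 11.0]

r = pearson_r(education_metric, pollution_metric)
print(f"Pearson r = {r:.3f}")
```

A value near -1 or 1 indicates a strong linear relationship and a value near 0 a weak one. As noted above, a strong correlation alone does not show that education causes cleaner air, since both can follow from a country's overall development.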
Tools used: Spark (PySpark), HBase (HappyBase), Seaborn, SciPy, Matplotlib, Flask (with flask_cors, logging), Leaflet. Languages: Python, JavaScript, HTML, CSS.