- Project Description
- Project Goals
- Requirements
- Deliverables
- Mentoring
- Schedule
- Datasets
- Presentation
- Tips & Tricks
- Proposing Questions
In this project, you will carry out and present an analysis of your choosing on a topic related to Lisbon, Berlin or Barcelona.
The available topics are:
- Transportation
- Urban environment
- Population
- Administration
- Economics & business
Suggested datasets are listed in the Datasets section below.
For additional datasets, you can also check the Open Data Berlin website and the Lisboa Aberta website.
- Learn to propose interesting questions that can be answered with data.
- Explore/research the data available related to your topic.
- Build a database from the data available.
- Perform a very simple analysis of your data and identify interesting insights.
- Learn something about Barcelona, Lisbon or Berlin!
- You must plan your project. That is why creating a Kanban or Trello Board is mandatory. You have a template for Trello here.
- You CAN'T CODE until your project is planned.
- Create a .gitignore file and include it in your repository.
- Specify questions you would like to answer about your topic.
- Choose data relevant to your questions.
- Design a structure for your database (add the aggregates you think will be useful). You don't need to specify the data types you will use, only the tables you will create and the relationships between them.
- Complete an analysis of your data and provide the most interesting insights.
NO PLOTTING IS ALLOWED IN THIS PROJECT
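As an illustration of the database design requirement above, here is a minimal sketch of a structure with tables, relationships and one aggregate. Every table and column name is hypothetical and not taken from the project datasets. SQLite is used only because it ships with Python; the database you actually use for the project may differ.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from the datasets.
SCHEMA = """
CREATE TABLE district (
    district_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE neighbourhood (
    neighbourhood_id INTEGER PRIMARY KEY,
    name             TEXT NOT NULL,
    district_id      INTEGER NOT NULL REFERENCES district(district_id)
);
CREATE TABLE population (
    neighbourhood_id INTEGER NOT NULL REFERENCES neighbourhood(neighbourhood_id),
    year             INTEGER NOT NULL,
    inhabitants      INTEGER NOT NULL,
    PRIMARY KEY (neighbourhood_id, year)
);
-- Aggregate defined as a view so it always reflects the base tables.
CREATE VIEW district_population AS
SELECT d.name AS district, p.year, SUM(p.inhabitants) AS inhabitants
FROM population p
JOIN neighbourhood n ON n.neighbourhood_id = p.neighbourhood_id
JOIN district d ON d.district_id = n.district_id
GROUP BY d.name, p.year;
"""


def build_database(path=":memory:"):
    """Create the sketch schema and return the open connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the declared relationships
    conn.executescript(SCHEMA)
    return conn


conn = build_database()
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY name"
    )
]
print(tables)
```

Defining the aggregate as a view is one possible choice; a separate summary table filled by a script would work as well if the aggregate is expensive to compute.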
You must turn in the following before the due date:
- Repo with all of the scripts you used to clean and analyse the data.
- Connection information for the database where you have stored the data.
- Slides for a 10-minute presentation. These must be turned in at least 30 minutes before the time of presentation. See the section on Presentation for more information.
One of the TAs or the Lead Teacher will be your mentor! Your mentor will:
- Follow your project as a whole; after you, your mentor will be the person who knows the project best.
- Check whether you are keeping up with your tasks, what your blockers are, etc.
- Help and support you with specific questions.
Your mentor is not meant to:
- Know everything.
- Be your manager.
Tuesday - Wednesday
- Fork the repository.
- Think about questions you could find interesting.
- Do some brainstorming about data you could use to answer your questions.
- Look for more data on the Barcelona Open Data website, the Catalonia Government website, and the Spanish Government website.
- For additional datasets, you can also check the Open Data Berlin website and the Lisboa Aberta website.
Wednesday
- Define tasks, separating those to be done individually from those to be done together.
- Build your database and clean the data.
Thursday
- Analyse your data, identify the most interesting insights, and prepare your presentation.
Friday
- Presentation!
accidents_2017: List of accidents handled by the local police in the city of Barcelona.
air_quality_Nov2017: Air quality measurements of O3 (tropospheric ozone), NO2 (nitrogen dioxide) and PM10 (suspended particles).
air_stations_Nov2017: Main characteristics of the air quality measure stations of the city of Barcelona.
births: Births by nationalities and by neighbourhoods of the city of Barcelona (2013-2017).
bus_stops: Bus stops, day bus stops, night bus stops, airport bus stops of the city of Barcelona.
deaths: Deaths by quinquennial ages and by neighbourhoods of the city of Barcelona (2015-2017).
immigrants_by_nationality: Immigrants by nationality and by neighbourhoods of the city of Barcelona (2015-2017).
immigrants_emigrants_by_age: Immigrants and emigrants by quinquennial ages and by neighbourhood of the city of Barcelona (2015-2017).
immigrants_emigrants_by_destination: Immigrants and emigrants by place of origin and destination, respectively (2017).
immigrants_emigrants_by_sex: Immigrants and emigrants by sex by neighbourhoods of the city of Barcelona (2013-2017).
most_frequent_baby_names: 25 Most common baby names in Barcelona, disaggregated by sex. Years 1996-2016.
most_frequent_names: 50 Most common names of the inhabitants of Barcelona, disaggregated by decade of birth and sex.
population: Population by neighbourhood, by quinquennial ages and by gender of the city of Barcelona (2013-2017). Taken from the register of inhabitants.
transports: Public transports (underground, Renfe, FGC, funicular, cable car, tramcar, etc) of the city of Barcelona.
unemployment: Registered unemployment by neighbourhood and gender in the city of Barcelona (2013-2017).
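To show how one of these datasets might go from a CSV file into a database table and then into a simple, plot-free analysis, here is a minimal sketch. The column names and all figures in the sample are invented for illustration; check the real file's header before adapting it. SQLite is an assumption made for convenience.

```python
import csv
import io
import sqlite3

# Stand-in for a file such as unemployment.csv.
# Column names and figures are invented for illustration.
SAMPLE_CSV = """year,neighbourhood,gender,registered
2017,Example A,Female,120
2017,Example A,Male,100
2017,Example B,Female,80
2017,Example B,Male,90
"""


def load_rows(conn, csv_file):
    """Clean each CSV row, insert it into a table, return the row count."""
    reader = csv.DictReader(csv_file)
    rows = [
        (
            int(r["year"]),
            r["neighbourhood"].strip(),
            r["gender"].strip(),
            int(r["registered"]),
        )
        for r in reader
    ]
    conn.execute(
        "CREATE TABLE unemployment "
        "(year INTEGER, neighbourhood TEXT, gender TEXT, registered INTEGER)"
    )
    conn.executemany("INSERT INTO unemployment VALUES (?, ?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
n = load_rows(conn, io.StringIO(SAMPLE_CSV))

# A simple aggregate: total registered unemployment per neighbourhood.
totals = conn.execute(
    "SELECT neighbourhood, SUM(registered) AS total "
    "FROM unemployment GROUP BY neighbourhood ORDER BY total DESC"
).fetchall()
for name, total in totals:
    print(f"{name}: {total}")
```

With a real file, the `io.StringIO(SAMPLE_CSV)` argument would be replaced by something like `open("unemployment.csv", newline="", encoding="utf-8")`, where the file name and encoding are assumptions.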
We recommend you compose your presentation of approximately 8 of the slides below:
- Title of the project
- Team
- Main challenges & strengths
- Data: sources you used, problems, limitations.
- Database: structure, new data you added, data you deleted, how you processed the data.
- Main insights & explanations. Give most importance to the most interesting or important questions.
- Questions you were not able to answer and why
- Workflow: what could you have done better? What was useful?
- Main learnings from the project
You will have 10 minutes to present your project. After all the presentations are over, we will have a 5-minute break and then a short debate about the projects.
Remember to present your insights as understandably as possible (with NO plots)!
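One plot-free way to show results on a slide is a plain aligned text table. The sketch below is only a suggestion; the helper `format_table` and the figures are hypothetical.

```python
def format_table(headers, rows):
    """Return headers and rows as a left-aligned, fixed-width text table."""
    table = [list(headers)] + [[str(cell) for cell in row] for row in rows]
    widths = [max(len(r[i]) for r in table) for i in range(len(headers))]
    lines = [
        "  ".join(cell.ljust(w) for cell, w in zip(r, widths)).rstrip()
        for r in table
    ]
    # Separator line between the header and the data rows.
    lines.insert(1, "  ".join("-" * w for w in widths))
    return "\n".join(lines)


# Invented figures for illustration.
rows = [("Example A", 220), ("Example B", 170)]
text = format_table(["Neighbourhood", "Registered"], rows)
print(text)
```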
- First the question, then the data.
- Before starting to write code, think about the analysis you would like to do (workflow).
- You will have more questions than answers. Don't worry, even if you can't answer any of your questions. Just show us why you couldn't answer something -- that itself will be interesting!
For example, suppose we are doing a project about the population of Barcelona. We could ask a descriptive question such as: How many people live in my neighbourhood? You would then need to find the data to answer it. But you could also propose an analytical question such as: Is my neighbourhood the best? Here you would need to think about how you could answer it; the word "best" is the key. It could mean:
- Safer
- More cultural events
- More schools
- ...
You will need to propose how you intend to answer the question. This is a very interesting exercise, and you will keep doing it for the rest of the bootcamp.
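As a sketch of how "best" could be turned into something measurable, the example below scores neighbourhoods on several criteria at once. It rescales each metric to the range 0 to 1 (min-max normalisation), flips the metrics where lower is better, and sums the results with equal weights. The neighbourhoods, metrics, figures and the equal weighting are all assumptions made for illustration; choosing and justifying them is exactly the part you need to propose yourself.

```python
# Invented figures for three made-up neighbourhoods.
data = {
    "Example A": {"accidents": 30, "schools": 5, "events": 12},
    "Example B": {"accidents": 10, "schools": 3, "events": 20},
    "Example C": {"accidents": 20, "schools": 8, "events": 4},
}

# +1 means higher is better, -1 means lower is better.
DIRECTION = {"accidents": -1, "schools": 1, "events": 1}


def score(data, direction):
    """Sum min-max normalised metrics per neighbourhood (equal weights)."""
    scores = {name: 0.0 for name in data}
    for metric, sign in direction.items():
        values = [row[metric] for row in data.values()]
        lo, hi = min(values), max(values)
        span = hi - lo
        for name, row in data.items():
            # If every neighbourhood has the same value, use a neutral 0.5.
            norm = (row[metric] - lo) / span if span else 0.5
            if sign < 0:
                norm = 1 - norm  # fewer accidents should score higher
            scores[name] += norm
    return scores


scores = score(data, DIRECTION)
ranking = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.2f}")
```

A different definition of "best" (for example, weighting safety twice as much as cultural events) would produce a different ranking, which is worth stating explicitly when you present your answer.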