This is Ironhack's mid-bootcamp project, developed and (not yet) completed by the knowledge triumvirate of Gladys, izzy & JC.
Ironhack storytelling: You are working as an analyst for a real estate company. Your company wants to build a machine learning model to predict the selling prices of houses based on a variety of features used to evaluate a house's value.
Ironhack objectives:
- Build a model that will predict the price of a house based on features provided in the dataset.
- Use business intelligence tools to explore the characteristics of the houses.
- Identify which factors are responsible for higher property values ($650K and above).
Group objectives:
- Organize and divide the work equally, according to each member's skills.
- Document the process and keep everyone updated.
- Merge all files and discuss the changes together.
- Learn about AGILE methodology.
- Work on personal weak points and try to learn from each other.
- Have fun! :)
Project deadline: 6 days (between 23/04/23 and 09/05/23)
Week 12:
- DAY 1 (25-04-2023): Project discussion, task assignment and division of the work.
- DAY 2 (27-04-2023): Starting with Trello, merging Python scripts, starting to work on SQL and storytelling brainstorming.
- DAY 3 (29-04-2023): Further improvements to the code and open discussions, more task assignments, SQL part done. Also:
  - We discussed basing "house_lifetime" on the last year of the dataset instead of the current year (2023).
  - We discussed making year a continuous variable (e.g. 2013,02 to represent February), so that a single feature represents yearly trends.
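The two ideas discussed on DAY 3 could be sketched roughly as below. This is only an illustration, not the project's actual code: the column names `date` and `yr_built`, the helper `add_time_features` and the output column `sale_year_cont` are assumptions. The sketch also encodes the month as a fraction of the year, `year + (month - 1) / 12`, so that months are evenly spaced, instead of the literal `2013,02` notation mentioned above.

```python
import pandas as pd


def add_time_features(df, date_col="date", built_col="yr_built"):
    """Add house_lifetime and a continuous sale-year feature (hypothetical helper)."""
    out = df.copy()
    sale_date = pd.to_datetime(out[date_col])

    # Reference year is the last sale year found in the data, not the current year.
    last_year = sale_date.dt.year.max()
    out["house_lifetime"] = last_year - out[built_col]

    # Year plus the fraction of the year elapsed: January -> .00, February -> ~.08.
    out["sale_year_cont"] = sale_date.dt.year + (sale_date.dt.month - 1) / 12
    return out


# Tiny made-up sample, only to show the shape of the result.
sample = pd.DataFrame({
    "date": ["2014-05-02", "2015-02-25"],
    "yr_built": [1955, 2001],
})
features = add_time_features(sample)
print(features[["house_lifetime", "sale_year_cont"]])
```

Anchoring `house_lifetime` to the last year in the dataset keeps the feature stable no matter when the notebook is re-run.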
Week 13:
- DAY 4 (02-05-2023): Presentation of the changes, brainstorming on how to improve the model, task division (Tableau, presentation, Python fine-tuning).
- DAY 5 (04-05-2023): Task division. Finishing Tableau, presentation and Python fine-tuning.
- DAY 6 (06-05-2023): Presentation day.
- 00_data --> data and datasets info.
- 01_usefulness --> catch-all drawer for quick access to functions, libraries and a template.
- 02_project_info --> Ironhack deliverable files.
- 03_python_scripts --> Python source code.
- 04_sql_script --> SQL source script.
- 05_sandboxes --> testing scripts and ideas.
- 06_tableau --> Tableau exercise.
- 07_presentation --> presentation and conclusions.
Environments
- JupyterLab: Python scripts.
- MySQL Workbench: SQL script.
- Tableau: Visualizations.
- Trello: Organization.
- Google Docs: Organization.
- Canva: Logo and presentation.
Libraries
- Pandas: Data manipulation.
- Os: File management.
- Warnings: Roses are red. Violets are blue. Warnings are annoying.
- Datetime: To play with time.
- Matplotlib: 2D visualizations.
- Seaborn: Statistical visualizations built on Matplotlib.
- Linear Regression model: From sklearn.
- Skew: Data asymmetry.
- StandardScaler: Data standardization (zero mean, unit variance).
- Train-test split: Train and test sets after the X-y split.
- Metrics: R2, RMSE, MSE, MAE.
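As a rough illustration of how the modelling libraries listed above fit together, here is a minimal sketch on synthetic data. It is not the project's actual pipeline: the feature names, coefficients and noise level are invented for the example, and NumPy is used only to generate the fake data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; column names and coefficients are invented.
rng = np.random.default_rng(42)
n = 500
X = pd.DataFrame({
    "sqft_living": rng.uniform(500, 5000, n),
    "bedrooms": rng.integers(1, 7, n),
})
y = 50_000 + 200 * X["sqft_living"] + 10_000 * X["bedrooms"] + rng.normal(0, 20_000, n)

# X-y split is done above; now split into train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training set only, then reuse it on the test set.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LinearRegression().fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)

# The four metrics listed above; RMSE is the square root of MSE.
mse = mean_squared_error(y_test, y_pred)
metrics = {
    "R2": r2_score(y_test, y_pred),
    "MSE": mse,
    "RMSE": float(np.sqrt(mse)),
    "MAE": mean_absolute_error(y_test, y_pred),
}
print(metrics)
```

Fitting `StandardScaler` on the training set alone avoids leaking test-set statistics into the model.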