🏡California Housing Price Prediction

This project performs regression on the California housing dataset and compares several ML models:
Linear Regression, Support Vector Regression, Decision Trees, and Random Forest Regression.

Notebook

Dataset

The dataset contains 20,640 entries and the following 10 variables:

Longitude
Latitude
Housing Median Age
Total Rooms
Total Bedrooms
Population
Households
Median Income
Median House Value
Ocean Proximity

Notebook

In the notebook, I perform:

Data investigation
Data cleaning
Removing outliers
Exploratory data analysis
Feature engineering
Dimensionality reduction
Feature encoding
Correlation and multicollinearity assessment
Feature scaling
Model training (including grid search)
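
The later steps in this list (feature engineering, encoding, scaling, and grid-searched training) can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the notebook's actual code: it runs on synthetic stand-in data, and the column names, the `rooms_per_household` feature, the median imputation, and the parameter grid are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (NOT the real housing.csv); column names are assumed.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "median_income": rng.uniform(1, 10, n),
    "housing_median_age": rng.integers(1, 52, n).astype(float),
    "total_rooms": rng.uniform(500, 5000, n),
    "households": rng.uniform(100, 1000, n),
    "ocean_proximity": rng.choice(["INLAND", "NEAR BAY", "NEAR OCEAN"], n),
})
df["median_house_value"] = 40000 * df["median_income"] + rng.normal(0, 20000, n)

# Feature engineering: a hypothetical ratio feature.
df["rooms_per_household"] = df["total_rooms"] / df["households"]

X = df.drop(columns="median_house_value")
y = df["median_house_value"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Feature encoding and scaling: impute and scale numeric columns,
# one-hot encode the categorical column.
numeric_cols = [c for c in X.columns if c != "ocean_proximity"]
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["ocean_proximity"]),
])

# Model training with a small, illustrative grid search.
model = Pipeline([
    ("prep", preprocess),
    ("rf", RandomForestRegressor(random_state=0)),
])
param_grid = {"rf__n_estimators": [20, 50], "rf__max_depth": [None, 5]}
search = GridSearchCV(
    model, param_grid, cv=3, scoring="neg_root_mean_squared_error"
)
search.fit(X_train, y_train)

# R^2 of the best pipeline on the held-out split.
test_r2 = search.best_estimator_.score(X_test, y_test)
print(search.best_params_, round(test_r2, 3))
```

Keeping the preprocessing inside the `Pipeline` means the scaler and encoder are refit on each cross-validation fold, so the grid search does not leak information from the validation folds.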

Results

The Random Forest Regression model was the best performer among the trained models, with a root mean squared error (RMSE) of about $43,658.

R^2 Score: 0.7933309926525507
Mean Absolute Error: 29580.49344298964
Mean Squared Error: 1906039202.1731477
Root Mean Squared Error: 43658.208875000215
Mean Absolute Percentage Error: 17.003087000720146%
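
For reference, the metrics above follow the standard definitions, which the short sketch below spells out in plain Python. The function name `regression_metrics` and the sample values are hypothetical and only serve to show the arithmetic; the notebook may compute these differently (for example with scikit-learn's metric functions). The RMSE is the square root of the MSE, so the last two error figures above describe the same quantity in different units.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, MSE, RMSE and MAPE (in percent) for two equal-length lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mse = sum(e * e for e in errors) / n           # mean squared error
    rmse = math.sqrt(mse)                          # same units as the target
    mape = 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot

    return {"r2": r2, "mae": mae, "mse": mse, "rmse": rmse, "mape": mape}

# Made-up house values in dollars, purely for illustration.
metrics = regression_metrics(
    y_true=[100000, 200000, 300000],
    y_pred=[110000, 190000, 330000],
)
print(metrics)
```

Because the RMSE and MAE are expressed in dollars, they are the easiest of these figures to compare directly against typical house values in the dataset.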


dilne/CaliforniaHousing
