Source code for the article "Linear Regression in Machine Learning using Python and Sklearn"
This project aims to introduce beginners to ML using linear regression and the California Housing Prices Dataset. We will train and evaluate two linear regression models using the scikit-learn library to understand the underlying patterns in the data and predict house prices.
The dataset includes these features:
- MedInc: median income in a block group (in tens of thousands of dollars)
- HouseAge: median house age in years within a block group
- AveRooms: average number of rooms per household in a block group
- AveBedrms: average number of bedrooms per household in a block group
- Population: total population for the entire block group
- AveOccup: average number of household members within a block group
- Latitude, Longitude: latitude and longitude of the block group's centroid
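
The features listed above can be loaded and checked with a few lines of Python. This is a minimal sketch, assuming the project uses scikit-learn's bundled copy of the dataset via `fetch_california_housing`; the helper name `load_housing` and the `FEATURE_NAMES` constant are illustrative and not taken from the article.

```python
from sklearn.datasets import fetch_california_housing

# Feature names as listed above (assumed to match scikit-learn's copy).
FEATURE_NAMES = [
    "MedInc", "HouseAge", "AveRooms", "AveBedrms",
    "Population", "AveOccup", "Latitude", "Longitude",
]


def load_housing():
    """Return (X, y) as NumPy arrays.

    Hypothetical helper: the first call downloads the dataset and caches
    it locally, so it needs network access once.
    """
    data = fetch_california_housing()
    if list(data.feature_names) != FEATURE_NAMES:
        raise ValueError(f"Unexpected feature names: {data.feature_names}")
    # data.data holds one row per block group, one column per feature;
    # data.target holds the median house value per block group.
    return data.data, data.target
```

Calling `X, y = load_housing()` gives a feature matrix with one column per name in `FEATURE_NAMES` and a target vector of median house values.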
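The multiple linear regression model achieved an MAE of 0.53, an RMSE of 0.7, and an R² score of 0.59. Assuming the target is the median house value in units of $100,000, as in scikit-learn's copy of the dataset, the MAE corresponds to a typical error of about $53,000, and the R² score means the model explains roughly 59% of the variance in house prices. That is a reasonable result for a plain linear model, though it leaves a substantial share of the variation unexplained.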
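
The metrics reported above can be computed along the following lines. This is a sketch under stated assumptions, not the article's exact code: the function name `train_and_evaluate`, the 80/20 split, and `random_state=42` are all illustrative choices, so the numbers it produces may differ slightly from those quoted here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def train_and_evaluate(X, y, test_size=0.2, random_state=42):
    """Fit a multiple linear regression and score it on a held-out split.

    The split ratio and seed are assumptions made for this sketch.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    model = LinearRegression().fit(X_train, y_train)
    y_pred = model.predict(X_test)

    metrics = {
        # Mean absolute error, in the same units as the target.
        "MAE": mean_absolute_error(y_test, y_pred),
        # Root mean squared error; penalizes large errors more than MAE.
        "RMSE": float(np.sqrt(mean_squared_error(y_test, y_pred))),
        # Share of the target's variance explained by the model.
        "R2": r2_score(y_test, y_pred),
    }
    return model, metrics
```

To try it on the housing data, pass in the arrays returned by `sklearn.datasets.fetch_california_housing(return_X_y=True)`, for example `model, metrics = train_and_evaluate(X, y)`, and print `metrics`.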