Final Project for Linear Regression

Contributors:

  • Wayne Lam
  • Omar Hussain

Data Sources:

EDA and Data Visualizations:

  • Relevant Features: [Figures: Alcohol Interactions Scatterplots, Alcohol Plots1, Alcohol Plots2]
  • Less Relevant Features: Certain Happiness Metrics [Figures: Suicide Rate vs. Life Ladder, Suicide Rate vs. GDP/Capita]
  • World Suicide Rates (mean over 2005-2016) [Figure: World Suicide Rates]

Feature Engineering:

  • Log GDP/Capita (already in UN WHR) and Log Suicide Rate
  • Feature Interactions with R^2 > 0.1 (13 out of 24 are related to alcohol dependency)
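
The two steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes that "R^2 > 0.1" refers to the R^2 of a one-feature linear fit of each pairwise product against the log suicide rate, and it runs on synthetic data. The names `r_squared`, `screen_interactions`, and the feature keys are hypothetical.

```python
from itertools import combinations

import numpy as np


def r_squared(x, y):
    """R^2 of a one-feature least-squares fit (the squared Pearson correlation)."""
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2


def screen_interactions(features, target, threshold=0.1):
    """Keep pairwise feature products whose R^2 against the target exceeds threshold."""
    kept = {}
    for a, b in combinations(sorted(features), 2):
        score = r_squared(features[a] * features[b], target)
        if score > threshold:
            kept[(a, b)] = score
    return kept


# Synthetic stand-in data (assumption: the real project uses WHO / UN WHR columns).
rng = np.random.default_rng(0)
n = 200
features = {
    "alcohol": rng.uniform(0, 10, n),
    "log_gdp": rng.uniform(6, 11, n),  # log GDP/capita is already provided in log form
    "noise": rng.normal(size=n),
}
suicide_rate = np.exp(0.2 * features["alcohol"] + rng.normal(scale=0.3, size=n))

# Log-transform the (strictly positive) target before screening.
log_suicide_rate = np.log(suicide_rate)

selected = screen_interactions(features, log_suicide_rate)
for pair, score in sorted(selected.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

Taking the log of the target only works when every rate is strictly positive; a dataset with zero rates would need a different transform, such as `np.log1p`.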

Feature Selection and Model Selection:

  • Wrapper Method (Recursive Feature Elimination) to select features
  • Embedded Method (Lasso Regression) to select features
  • GridSearch for hyperparameter tuning for regularized regressions
  • Ridge Regression generally had the highest R^2 score and the lowest RMSE: R^2 = 0.628 and RMSE = 4.211 (reported with mean 10.459, std 7.274)
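
As a rough illustration of the selection and tuning steps listed above, the sketch below uses scikit-learn on synthetic data. It is an assumption that the project used scikit-learn; the wrapper step is shown with `RFE` (recursive feature elimination), and the alpha grid, number of features, and train/test split are placeholder choices rather than the project's settings.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data standing in for the real feature matrix.
X, y = make_regression(
    n_samples=300, n_features=12, n_informative=4, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Wrapper method: repeatedly drop the weakest feature until 4 remain.
rfe = RFE(estimator=LinearRegression(), n_features_to_select=4)
rfe.fit(X_train, y_train)
rfe_selected = np.flatnonzero(rfe.support_)

# Embedded method: Lasso shrinks some coefficients exactly to zero.
scaler = StandardScaler().fit(X_train)
lasso = Lasso(alpha=1.0).fit(scaler.transform(X_train), y_train)
lasso_selected = np.flatnonzero(lasso.coef_)

# Grid search over the ridge penalty, scored by cross-validated RMSE.
search = GridSearchCV(
    make_pipeline(StandardScaler(), Ridge()),
    param_grid={"ridge__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]},
    scoring="neg_root_mean_squared_error",
    cv=5,
)
search.fit(X_train, y_train)

# Evaluate the tuned model on the held-out split.
y_pred = search.predict(X_test)
r2 = r2_score(y_test, y_pred)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))

print("RFE features:", rfe_selected)
print("Lasso features:", lasso_selected)
print("Best alpha:", search.best_params_["ridge__alpha"])
print("R^2:", round(r2, 3), "RMSE:", round(rmse, 3))
```

Scaling inside the pipeline keeps the ridge penalty comparable across features and avoids leaking test-set statistics into cross-validation. An RMSE is easiest to interpret next to the spread of the target, which is presumably why the results above are reported together with a mean and standard deviation.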