kaggle_income

Course Kaggle competition to predict income using US census data. Placed in the top 10 out of 50 students.

The Jupyter notebook contains the EDA and feature engineering. Five models were evaluated initially with sklearn: logistic regression, k-nearest neighbors, decision trees, random forest, and gradient boosting. Grid search was used to tune the hyperparameters of the random forest and gradient boosting models. Random forest was chosen as the final model because it had the highest 10-fold cross-validation accuracy, and prediction speed was not a concern since there was no real-time prediction scenario to consider.
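The workflow described above can be sketched roughly as follows. This is only an illustration, not the notebook's actual code: the data is a synthetic stand-in generated with `make_classification` (the real features come from `income_train.csv` after feature engineering), and the parameter grid, estimator counts, and scaling step are assumptions chosen to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the engineered census features (assumption).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# The five model families named in the README, with illustrative settings.
# Scaling is added for the distance- and gradient-based models (assumption).
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "k-nearest neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gradient boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# Mean 10-fold cross-validation accuracy for each model.
cv_accuracy = {
    name: cross_val_score(model, X, y, cv=10, scoring="accuracy").mean()
    for name, model in models.items()
}

# Grid search over a small, hypothetical random forest grid.
param_grid = {"n_estimators": [25, 50], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=10, scoring="accuracy"
)
search.fit(X, y)

# Report models from highest to lowest cross-validation accuracy.
for name, acc in sorted(cv_accuracy.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
print("best random forest params:", search.best_params_)
```

On the synthetic data the ranking of the models will not necessarily match the one found in the competition; the point is only the shape of the comparison and tuning steps.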

Files:
A5-Kaggle.ipynb
README.md
income_test.csv
income_train.csv
submission.csv

hwunrow/kaggle_income
