kaggle_SFCrime

UPDATE: the trained model is deployed as a Flask web app hosted on pythonanywhere.com (http://jypucca.pythonanywhere.com/).

This repository was created for the Kaggle SF crime prediction project. In the notebook SF crime (1).ipynb, the Bernoulli Naive Bayes method was used. It assumes that the features (columns in the dataframe) are conditionally independent given the class, and it treats each feature as a binary variable. Therefore, one-hot encoding is required to format the data. In SF crime (2).ipynb, an eXtreme Gradient Boosting (XGB) classifier was used and yielded better results (top 29% on the leaderboard). XGB is a regularized boosting algorithm in which each new weak learner is trained on the errors of the ensemble built so far, so that previously misclassified samples (or, in the case of regression, samples with large errors) have more influence on what it learns.
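As a rough illustration of the one-hot encoding step described above, the following sketch encodes two categorical columns and fits scikit-learn's `BernoulliNB` on them. The toy rows and the column names (`DayOfWeek`, `PdDistrict`, `Category`) are assumptions made for the example and are not taken from the notebook; the notebook's actual preprocessing may differ.

```python
import pandas as pd
from sklearn.naive_bayes import BernoulliNB

# Toy records. Column names and values are illustrative assumptions,
# not the real Kaggle data or the notebook's exact schema.
df = pd.DataFrame({
    "DayOfWeek": ["Monday", "Friday", "Friday", "Sunday", "Monday", "Sunday"],
    "PdDistrict": ["MISSION", "SOUTHERN", "MISSION",
                   "NORTHERN", "SOUTHERN", "NORTHERN"],
    "Category": ["ASSAULT", "LARCENY/THEFT", "ASSAULT",
                 "LARCENY/THEFT", "LARCENY/THEFT", "ASSAULT"],
})

# One-hot encode the categorical features so every column is 0 or 1,
# which is the input Bernoulli Naive Bayes expects.
X = pd.get_dummies(df[["DayOfWeek", "PdDistrict"]], dtype=int)
y = df["Category"]

model = BernoulliNB()
model.fit(X, y)

# Per-class probabilities for each row, one column per crime category.
proba = model.predict_proba(X)
print(X.columns.tolist())
print(model.classes_, proba.shape)
```

Each original column becomes one binary column per distinct value, so the model only ever sees presence or absence of a value.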

For the XGB classifier, a feature importance graph was generated using the XGB library. The feature importance score (F score on the x-axis) is a measure of how useful a feature is for improving decision tree performance within the ensemble model. Based on this graph, the GPS location, the hour of day, and certain months (June, October) had strong predictive power in determining the probability of a crime category.

Using Tableau, a map of the crime activity was generated as shown (for the interactive map, click here).

