Data-Analytics-Project

Credit Card Fraud Detection

Combination of Unsupervised and Supervised Technique in Credit Card Fraud Detection

This is a mini project at NITK Surathkal.

Dataset-

The dataset, creditcard.csv, was collected from Kaggle. It contains credit card transactions made by European cardholders in September 2013. The dataset is extremely unbalanced: the positive class (frauds) accounts for only 0.172% of all transactions. It holds only numeric variables; there are no object (non-numeric) variables in the dataset. All the numeric variables are the result of a PCA transformation. Link to dataset: https://www.kaggle.com/mlg-ulb/creditcardfraud
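
To make the imbalance concrete, the fraud rate can be computed from the label column. This is a minimal sketch, not the project's own code: it assumes the CSV sits next to the notebook and that the label column is named Class (1 = fraud, 0 = legitimate), as in the Kaggle file. The helper name fraud_rate is made up for illustration.

```python
from pathlib import Path

import pandas as pd


def fraud_rate(df: pd.DataFrame, label_col: str = "Class") -> float:
    """Return the percentage of rows labelled as fraud (label == 1)."""
    return float(100.0 * (df[label_col] == 1).mean())


# Assumed location of the file downloaded from the Kaggle link above.
csv_path = Path("creditcard.csv")
if csv_path.exists():
    data = pd.read_csv(csv_path)
    print(f"Fraud rate: {fraud_rate(data):.3f}%")
```

With so few positives, plain accuracy says little on its own, which is why recall and F1 score are reported alongside it.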

Idea-

This project proposes a hybrid approach that uses unsupervised outlier scores to extend the feature set of a fraud detection classifier. We took the outlier scores from unsupervised outlier detection models, added them as new features, and fed the extended data to the classifier. We compared various classifiers, both on their own and with outlier scores as features. Among the classifiers, the random forest classifier (RFC) and SVC performed well with respect to accuracy, recall, and F1 score, so we selected them as base models for the outlier score comparison. After adding the scores, we compared SVC + outlier score and RFC + outlier score. The CBLOF (cluster-based local outlier factor) outlier score increased the accuracy of both models, i.e., SVC and RFC.

While we couldn't attain our aim of 100% efficiency in fraud discovery, we ended up building a method that can, with sufficient time and data, get very near to that aim. As with any such project, there is some scope for improvement here. The nature of this project allows various algorithms to be combined as modules, and their results can be merged to improve the accuracy of the final result.
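
The idea of extending the feature set with an outlier score can be sketched as follows. This is an illustrative example under stated assumptions, not the notebooks' actual code: it uses synthetic data from make_classification instead of creditcard.csv, and it swaps in scikit-learn's IsolationForest as the unsupervised detector purely to stay self-contained, whereas the project reports CBLOF as the best-performing score. The helper name add_outlier_score is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.metrics import f1_score, recall_score
from sklearn.model_selection import train_test_split


def add_outlier_score(detector, X_train, X_test):
    """Fit an unsupervised detector on the training features only and
    append its anomaly score as one extra column to both splits."""
    detector.fit(X_train)
    # score_samples returns lower values for more abnormal points,
    # so negate it to get "higher = more anomalous".
    train_score = -detector.score_samples(X_train)
    test_score = -detector.score_samples(X_test)
    return (
        np.column_stack([X_train, train_score]),
        np.column_stack([X_test, test_score]),
    )


# Synthetic, imbalanced stand-in for the real data (roughly 2% positives).
X, y = make_classification(
    n_samples=3000, n_features=10, n_informative=5,
    weights=[0.98, 0.02], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Unsupervised step: the detector never sees the labels.
X_train_ext, X_test_ext = add_outlier_score(
    IsolationForest(random_state=0), X_train, X_test
)

# Supervised step: train the classifier on the extended feature set.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train_ext, y_train)
pred = clf.predict(X_test_ext)
print("recall:", recall_score(y_test, pred))
print("F1:", f1_score(y_test, pred))
```

Fitting the detector on the training split only keeps test information out of the new feature. Any detector that exposes a comparable per-sample score could be substituted for IsolationForest here.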

A simple flow of the project-

Results only with supervised algorithms-

Results with Sampling Algorithms (SMOTE)-

Results with combination of all (Unsupervised + SMOTE + Supervised)-

How to setup the project

This project is built with Python 3 in Jupyter Notebook. The required libraries are imported in the first part of every code file. Make sure all of them are installed so that the project runs smoothly.

Clone the project

$ git clone https://github.com/Akbhobhiya/Data-Analytics-Project.git
$ cd Data-Analytics-Project

Contributor

Ashok Bhobhiya

Want to contribute?
Please feel free to send a pull request, or to create an issue if you find any problems.
