Welcome to FraudGuard-ML, a machine learning project for detecting credit card fraud! Built by a team of passionate developers from Ulster University, this project applies data science to help secure financial transactions and build trust in the digital economy.
# Project Overview
What's the Deal?
FraudGuard-ML uses the Kaggle Credit Card Fraud Detection dataset (284,807 transactions!) to identify fraudulent transactions, including a simulated real-time scoring step. Only 0.172% of the transactions are fraudulent, so we tackled the class imbalance head-on with SMOTE resampling and imbalance-aware metrics such as AUC-PR.
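
To see why the imbalance matters, here is a small back-of-the-envelope sketch using only the two figures quoted above. The "never flag fraud" baseline is a hypothetical comparison point, not a model from this project.

```python
# Figures quoted in the overview; everything else is derived from them.
total_transactions = 284_807
fraud_rate = 0.00172  # 0.172% expressed as a fraction

# Rough number of fraudulent rows implied by the quoted rate.
approx_fraud_count = round(total_transactions * fraud_rate)

# Accuracy of a hypothetical model that labels every transaction legitimate.
baseline_accuracy = 1 - fraud_rate

print(f"Approximate fraud cases: {approx_fraud_count}")
print(f"Accuracy of a 'never fraud' baseline: {baseline_accuracy:.3%}")
```

A model can score above 99.8% accuracy without catching a single fraud, which is why the results in this README are reported as F1 and AUC-PR rather than accuracy.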
Tech Stack:
Python
Libraries: Pandas, Scikit-learn, XGBoost, Seaborn, Matplotlib, Imbalanced-learn
Tools: Google Colab, Kaggle API
Key Features:
Data cleaning and preprocessing (no missing values, duplicates gone!)
Feature engineering with PCA and Random Forest for top-notch insights
Models: Logistic Regression, Random Forest, SVM, and XGBoost (the winner, with 0.94 AUC-PR!)
Real-time fraud detection simulation
Stunning visualizations (heatmaps, violin plots, scatter plots)
# How It Works
Data Import & Cleaning:
Grabbed the dataset from Kaggle, cleaned it with Pandas, and sampled 10,000 records for efficiency. No dirty data here!
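
The cleaning and sampling steps could look roughly like the sketch below. It builds a small synthetic frame instead of downloading the Kaggle file, so the column subset, the injected duplicates, and the random seed are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Kaggle CSV (assumption: the real file would be
# loaded with pd.read_csv and contains Time, V1..V28, Amount and Class).
rng = np.random.default_rng(42)
n_rows = 50_000
df = pd.DataFrame({
    "Time": rng.integers(0, 172_800, size=n_rows),
    "V1": rng.normal(size=n_rows),
    "Amount": rng.exponential(scale=88.0, size=n_rows).round(2),
    "Class": (rng.random(n_rows) < 0.002).astype(int),
})
# Inject 100 duplicate rows so the cleaning step has something to remove.
df = pd.concat([df, df.head(100)], ignore_index=True)

# Check for missing values, drop exact duplicates, then sample 10,000 rows.
missing_values = int(df.isna().sum().sum())
df_clean = df.drop_duplicates().reset_index(drop=True)
sample = df_clean.sample(n=10_000, random_state=42)

print(f"Missing values: {missing_values}")
print(f"Rows after de-duplication: {len(df_clean)}")
print(f"Fraud rows in the sample: {int(sample['Class'].sum())}")
```

A plain random sample keeps roughly the original fraud rate, so only a handful of fraud rows survive in 10,000 records. Whether the project sampled at random or stratified by class is not stated here.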
Data Wrangling:
Scaled features like Time and Amount with StandardScaler and balanced the classes with SMOTE. Balanced datasets = happy models!
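
The sketch below illustrates both steps on toy data. Scaling uses scikit-learn's `StandardScaler`. The oversampling function is a simplified, hand-written version of the SMOTE idea (interpolating between minority-class neighbours), not imbalanced-learn's implementation. The data, the function name `smote_like_oversample`, and its parameters are hypothetical.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical toy data: two columns standing in for Time and Amount.
X_legit = rng.normal(loc=[50_000, 60], scale=[20_000, 40], size=(500, 2))
X_fraud = rng.normal(loc=[80_000, 250], scale=[10_000, 90], size=(20, 2))
X = np.vstack([X_legit, X_fraud])
y = np.array([0] * 500 + [1] * 20)

# Standardize each column to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)


def smote_like_oversample(X_min, n_new, k=5, rng=rng):
    """Create n_new synthetic points by interpolating between a sampled
    minority point and one of its k nearest minority neighbours."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)  # column 0 is normally the point itself
    base = rng.integers(0, len(X_min), size=n_new)
    neighbour = idx[base, rng.integers(1, k + 1, size=n_new)]
    gap = rng.random((n_new, 1))  # position along the connecting segment
    return X_min[base] + gap * (X_min[neighbour] - X_min[base])


# Add synthetic fraud rows until both classes have the same size.
X_min = X_scaled[y == 1]
n_needed = int((y == 0).sum() - (y == 1).sum())
X_syn = smote_like_oversample(X_min, n_needed)
X_bal = np.vstack([X_scaled, X_syn])
y_bal = np.concatenate([y, np.ones(n_needed, dtype=int)])

print(f"Class counts after balancing: {np.bincount(y_bal)}")
```

With imbalanced-learn installed, the hand-written function can be replaced by `SMOTE(random_state=0).fit_resample(X_scaled, y)`. In a real pipeline it is usually safer to fit the scaler and apply the resampling on the training split only, so that synthetic points do not leak into evaluation.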
Analysis & Visualization:
Plotted class distributions, correlation heatmaps, and feature distributions to uncover fraud patterns. Eye-candy for data lovers!
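
The numbers behind those plots can be computed with Pandas alone. This is a minimal sketch on made-up data: the columns, the inflated 5% fraud rate, and the shift applied to `V1` are assumptions chosen so that one feature visibly correlates with the label.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 2_000
is_fraud = (rng.random(n) < 0.05).astype(int)  # inflated rate for toy data

df = pd.DataFrame({
    "V1": rng.normal(size=n) - 2.0 * is_fraud,  # hypothetical fraud shift
    "V2": rng.normal(size=n),
    "Amount": rng.exponential(scale=80.0, size=n),
    "Class": is_fraud,
})

# Share of each class: the data behind a class-distribution bar chart.
class_share = df["Class"].value_counts(normalize=True)

# Pairwise Pearson correlations: the matrix a heatmap would display.
corr = df.corr()

# Features ranked by absolute correlation with the fraud label.
corr_with_class = corr["Class"].drop("Class").sort_values(key=abs, ascending=False)

print(class_share)
print(corr_with_class)
```

Passing `corr` to Seaborn's `sns.heatmap(corr, annot=True, cmap="coolwarm")` would render the correlation heatmap itself.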
Modeling:
Trained multiple classifiers, with XGBoost shining brightest (F1-Score: 0.86, AUC-PR: 0.94). Confusion matrices? Check! ROC curves? Double check!
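
Here is a hedged sketch of the train-and-evaluate loop for one classifier. It uses scikit-learn's Random Forest on synthetic imbalanced data as a stand-in. The dataset, split ratio, and 0.5 threshold are assumptions, and the scores it prints are not the project's results. AUC-PR is estimated with `average_precision_score`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data (about 3% positives) standing in for the real set.
X, y = make_classification(
    n_samples=5_000, n_features=10, n_informative=5,
    weights=[0.97, 0.03], random_state=0,
)

# Stratify so both splits keep the same fraud rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Probability of the positive (fraud) class, then a hard label at 0.5.
fraud_proba = model.predict_proba(X_test)[:, 1]
y_pred = (fraud_proba >= 0.5).astype(int)

f1 = f1_score(y_test, y_pred)
auc_pr = average_precision_score(y_test, fraud_proba)  # estimate of AUC-PR
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted

print(f"F1: {f1:.2f}  AUC-PR: {auc_pr:.2f}")
print(cm)
```

Any estimator with `fit` and `predict_proba` can be dropped in for the Random Forest, including `xgboost.XGBClassifier` if XGBoost is installed.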
Deployment:
Built a real-time prediction function and saved the Random Forest model for deployment. Ready to catch fraudsters in action!
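
A prediction function and model persistence step might look like the sketch below. The function name `predict_transaction`, the file name, the 0.5 threshold, and the use of `joblib` are all assumptions. The README does not say how the model was actually saved.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Train a small stand-in model on synthetic data.
X, y = make_classification(
    n_samples=2_000, n_features=6, weights=[0.95, 0.05], random_state=0
)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)


def predict_transaction(model, features, threshold=0.5):
    """Score one transaction; `features` is a 1-D sequence of model inputs."""
    row = np.asarray(features, dtype=float).reshape(1, -1)
    fraud_probability = float(model.predict_proba(row)[0, 1])
    return {
        "fraud_probability": fraud_probability,
        "is_fraud": fraud_probability >= threshold,
    }


# Save the model to disk and load it back, as a deployment step would.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "fraud_model.joblib"  # hypothetical file name
    joblib.dump(model, model_path)
    reloaded = joblib.load(model_path)

result = predict_transaction(reloaded, X[0])
print(result)
```

Incoming transactions must go through the same scaling as the training data before they reach the function, so the fitted scaler would need to be saved alongside the model.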
# Results & Impact
Performance:
XGBoost assigned a 93.6% fraud probability to a sample transaction (Transaction ID: 172001, Amount: €149.23). False positives? Minimized!
Impact:
With global card fraud losses hitting $32.34 billion in 2022, FraudGuard-ML is a step toward safer online banking. Let's protect those wallets!
Limitations:
Class imbalance, feature interpretability (thanks, PCA!), and overfitting risks are noted. Future work: fairness audits and real-world testing!