Credit Spending Habits Visualization
Credit Spending in India using Tableau and Streamlit
In this project, we combine and practice what we have learned throughout this course, including:
- SQL queries using pgAdmin
- Cleaning data using Python
- Creating an interactive dashboard with Tableau
- Applying bootcamp learnings in a single end-to-end project: data retrieval, EDA / cleaning, statistical modeling (optional), Tableau
- Main deliverable: Tableau dashboard(s) and presentation, plus Jupyter notebooks where used
├── data                                <- Data folder
│   ├── cleaned_data_india_credit.csv   <- Cleaned data source
│   └── credit_card_india.csv           <- Original data source
│
├── output                              <- Images used in the project and output files
│   ├── Credit Dash.png                 <- Dashboard snapshot
│   ├── sql_output.md                   <- Output of SQL queries
│   └── ERD_diagram.png                 <- Entity relationship diagram
│
├── src/modules                         <- Source code
│   ├── cleaning_data.ipynb             <- Code for data cleaning
│   ├── SQL_Credit_card.ipynb           <- Code for SQL queries
│   ├── EDA.ipynb                       <- EDA notebook
│   ├── credit-card-spending-ml.ipynb   <- Machine learning notebook
│   ├── Credit_viz.twbx                 <- Tableau file with dashboard
│   └── streamlit                       <- Folder for all the Streamlit app code
│       ├── app.py                      <- Viz app for different plots
│       ├── eda_app.py                  <- Pandas profiling app
│       └── requirements.txt            <- List of all dependencies
│
├── __init__.py                         <- Package initializer
└── README.md                           <- Project documentation
- Dataset - https://www.kaggle.com/datasets/thedevastator/analyzing-credit-card-spending-habits-in-india
- Step 1: Acquire the dataset, import it into a Jupyter notebook, clean it, and export the cleaned file (a minimal pandas sketch follows this list).
- Step 2: Import the cleaned file into Tableau and SQL.
- Step 3: Run SQL queries to answer questions such as: the top 5 cities with fraud, which gender has the most fraud, and which card type had the most fraud.
- Step 4: Build an interactive Tableau dashboard based on city and date.
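The actual cleaning code lives in src/modules/cleaning_data.ipynb. Purely as an illustration of the Step 1 flow, here is a minimal pandas sketch; the column names (amount, date) and the specific cleaning rules are assumptions, not necessarily what the notebook does.

```python
import pandas as pd

# Load the original Kaggle export (paths follow the project tree above).
df = pd.read_csv("data/credit_card_india.csv")

# Illustrative cleanup only -- the real rules are in cleaning_data.ipynb.
df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]  # normalize headers
df = df.drop_duplicates()                                               # drop exact duplicates
df = df.dropna(subset=["amount"])                                       # assumed 'amount' column
df["date"] = pd.to_datetime(df["date"], errors="coerce")                # assumed 'date' column

# Export the cleaned file that Step 2 imports into Tableau and SQL.
df.to_csv("data/cleaned_data_india_credit.csv", index=False)
```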
- Please refer to the following:
  - SQL queries MD file (output/sql_output.md); a minimal sketch of the kind of query it contains appears after this list
  - ERD diagram (output/ERD_diagram.png)
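The queries actually used, and their results, are in output/sql_output.md, and the project ran them via pgAdmin / SQLite. As a rough illustration of a Step 3 style question (top 5 cities by fraud), the sketch below loads the cleaned CSV into an in-memory SQLite table with the standard library; the table name `transactions` and the `city` / `is_fraud` columns are assumptions for the example.

```python
import sqlite3
import pandas as pd

# Load the cleaned CSV into a throwaway SQLite table so the query can run anywhere.
conn = sqlite3.connect(":memory:")
df = pd.read_csv("data/cleaned_data_india_credit.csv")
df.to_sql("transactions", conn, index=False)   # assumed table name

# Example Step 3 question: top 5 cities by fraud count.
# The 'city' and 'is_fraud' columns are assumptions; see output/sql_output.md
# for the queries actually used in the project.
query = """
SELECT city,
       COUNT(*) AS fraud_count
FROM transactions
WHERE is_fraud = 1
GROUP BY city
ORDER BY fraud_count DESC
LIMIT 5;
"""
print(pd.read_sql_query(query, conn))
```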
- This GIF showcases the pandas profiling module deployed on Streamlit, which lets you do EDA by uploading a dataset.
- This lets you do EDA with any uploaded dataset; it creates various plots for your analysis and is also deployed on Streamlit.
- Test out the app here: EDA app deployment on Streamlit Cloud
- Note: the cloud deployment does not use the uploaded dataset for some reason (it is supposed to behave as in the GIF above), but it works on localhost. A minimal sketch of the profiling app follows this list.
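For orientation, a minimal version of what eda_app.py does could look like the sketch below: upload a CSV and render a profiling report inside Streamlit. This is an assumption-based sketch using the ydata-profiling package (the successor to pandas-profiling) and the streamlit-pandas-profiling component, not the project's exact code.

```python
import pandas as pd
import streamlit as st
from ydata_profiling import ProfileReport              # successor to pandas-profiling
from streamlit_pandas_profiling import st_profile_report

st.title("EDA with pandas profiling")

# Let the user upload any CSV, then profile it in the browser.
uploaded = st.file_uploader("Upload a CSV file", type=["csv"])
if uploaded is not None:
    df = pd.read_csv(uploaded)
    st.write("Preview", df.head())
    profile = ProfileReport(df, explorative=True)       # build the full profiling report
    st_profile_report(profile)                          # render it inside Streamlit
```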
To run the Streamlit app, make sure you are in the src/modules/streamlit folder, then run the following command:

    streamlit run app.py
- Presentation: https://docs.google.com/presentation/d/1zzXzLE6kJSKPbSglUbs9xWFQATFGJms0aH-3kAz8Uek/edit#slide=id.p
- Tableau Public profile: https://public.tableau.com/app/profile/gurmol.sohi/vizzes
- Tableau dashboard: https://public.tableau.com/app/profile/gurmol.sohi/viz/Credit_viz/Dashboard1
- Clicking on a city adjusts all the other graphs; you can also select specific months to gain insight.
- Mumbai and Bengaluru had the greatest spending.
- Most spending was on bills, food, and fuel.
- The highest percentage of Gold card type spending was in Zira.
- The lowest Gold card type percentage was in Achalpur.
- The highest-spend month was August, with the Platinum card type.
- Greater Mumbai's highest expense type was bills and its lowest was entertainment.
- Greater Mumbai accounted for 14% of the total spending in the dataset.
- 01/2015 had the highest amount of spending.
- It was a challenge to create interactivity in Tableau.
- Had lots of issues creating a database with SQLite to begin with and running queries against it.
- Importing the file from a different folder into the cleaning notebook caused some issues (see the path sketch after this list).
- Streamlit was not working at first due to issues with the pandas profiling library.
- Added a state column in Excel Online to make the Tableau dashboard better.
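One way to sidestep the folder-import issue (an assumption about the fix, not necessarily what the notebook ended up doing) is to build the CSV path from the project layout instead of relying on wherever the kernel happens to be running:

```python
from pathlib import Path
import pandas as pd

# Assumes the notebook's working directory is src/modules (per the project tree),
# so two levels up is the project root that contains the data/ folder.
DATA_DIR = Path.cwd().parents[1] / "data"
df = pd.read_csv(DATA_DIR / "credit_card_india.csv")
```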
- Develop an ML model and make predictions on credit fraud.
- Compare against other datasets and spending habits in different countries.
- Customer segmentation, credit risk, credit fraud detection (anomaly detection), and credit approval projects coming soon.