Car Accidents Insights

Based on Kaggle's "US Accidents (2016 - 2023)" Dataset

Dataset Overview

The dataset file, "US_Accidents_March23.csv", contains records of car accidents across the United States from 2016 through March 2023. Each record includes:

- Location details: city, state, and coordinates.
- Accident details: severity, start and end times.
- Environmental features: weather, road conditions, and more.

Objective

The primary goals of this project are:

- Clean and preprocess the dataset for analysis.
- Sample a subset of 10,000 rows for quick exploration.
- Ensure datetime consistency for time-related columns (Start_Time and End_Time).
- Perform initial analysis to highlight accident trends.

Steps Undertaken

1️⃣ Data Loading

The dataset was loaded into a pandas DataFrame:

import pandas as pd
file_path = r"US_Accidents_March23.csv"
df = pd.read_csv(file_path)
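The full CSV is large, so it can help to read only the columns an analysis needs. The sketch below is illustrative rather than part of the project code: it uses a tiny in-memory CSV with made-up rows in place of the real file, and the column selection in `wanted_columns` is an assumption to adjust as needed.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for US_Accidents_March23.csv (rows are made up).
sample_csv = io.StringIO(
    "ID,Severity,Start_Time,End_Time,State,City,Description\n"
    "A-1,2,2021-01-05 08:00:00,2021-01-05 09:00:00,CA,Los Angeles,demo row\n"
    "A-2,3,2021-02-10 17:30:00,2021-02-10 18:15:00,TX,Houston,demo row\n"
)

# Hypothetical column selection; adjust to the analysis at hand.
wanted_columns = ["Severity", "Start_Time", "End_Time", "State", "City"]

# usecols skips the other columns at parse time, which reduces memory use.
df_small = pd.read_csv(sample_csv, usecols=wanted_columns)
print(df_small.shape)
```

With the real file, `sample_csv` would be replaced by `file_path`.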

2️⃣ Cleaning Missing Values

Rows containing missing (NA) values were dropped:

df_no_na = df.dropna()

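This step keeps only rows that have no missing value in any column. With default arguments, dropna() discards a row if even one column is empty, so it can remove a large share of the data when some columns are sparsely filled.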
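Before dropping rows, it may be worth checking how much data the step removes. The following is a minimal sketch on a made-up DataFrame; `Sparse_Feature` is a hypothetical column standing in for any sparsely filled field, and the `subset` alternative is a suggestion, not what the project does.

```python
import numpy as np
import pandas as pd

# Made-up data; Sparse_Feature is a hypothetical, sparsely filled column.
demo = pd.DataFrame({
    "State": ["CA", "TX", "FL", "NY"],
    "Severity": [2, 3, 2, 4],
    "Sparse_Feature": [np.nan, 50.0, np.nan, 41.0],
})

# Share of missing values per column, highest first.
missing_share = demo.isna().mean().sort_values(ascending=False)

# Fraction of rows lost when every column must be present.
rows_before = len(demo)
rows_after = len(demo.dropna())
dropped_share = 1 - rows_after / rows_before

# Alternative: require only the columns the analysis depends on.
rows_after_subset = len(demo.dropna(subset=["State", "Severity"]))

print(missing_share)
print(f"dropped {dropped_share:.0%} of rows")
```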

3️⃣ Sampling 10,000 Rows

From the cleaned dataset, a random sample of 10,000 rows was selected:

cleaned_accidents = df_no_na.sample(n=10000, random_state=42)

Setting random_state=42 fixes the random seed, so the same 10,000 rows are drawn on every run.
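As a quick illustration of the reproducibility claim, the sketch below draws two samples with the same seed from a made-up DataFrame and checks that they match. It also shows one possible guard, an assumption on our part, for the case where fewer than 10,000 rows survive cleaning, since `sample` raises an error when `n` exceeds the number of rows and `replace` is left at its default.

```python
import pandas as pd

# Made-up data with 100 rows.
demo = pd.DataFrame({"value": range(100)})

# The same seed yields the same rows in the same order.
first = demo.sample(n=10, random_state=42)
second = demo.sample(n=10, random_state=42)
same_sample = first.equals(second)

# Cap the sample size at the number of available rows.
n_rows = min(10_000, len(demo))
capped = demo.sample(n=n_rows, random_state=42)

print(same_sample, len(capped))
```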

4️⃣ Datetime Conversion

Columns Start_Time and End_Time were converted to datetime format:

cleaned_accidents['Start_Time'] = pd.to_datetime(cleaned_accidents['Start_Time'])
cleaned_accidents['End_Time'] = pd.to_datetime(cleaned_accidents['End_Time'])

Converting these columns lets pandas perform date arithmetic and extract components such as month or hour for trend analysis.
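If some timestamps are malformed or use a different format, a plain `pd.to_datetime` call can raise an error. One defensive option, sketched here on made-up values, is `errors="coerce"`, which turns unparseable entries into `NaT` so they can be counted and inspected. The `Duration_Min` column is a hypothetical addition that shows what the conversion enables.

```python
import pandas as pd

# Made-up timestamps; the third Start_Time is deliberately invalid.
demo = pd.DataFrame({
    "Start_Time": ["2021-01-05 08:00:00", "2021-02-10 17:30:00", "not a date"],
    "End_Time": ["2021-01-05 09:30:00", "2021-02-10 18:15:00", "2021-03-01 10:00:00"],
})

# errors="coerce" replaces unparseable values with NaT instead of raising.
for col in ["Start_Time", "End_Time"]:
    demo[col] = pd.to_datetime(demo[col], errors="coerce")

# Count the values that could not be parsed.
unparsed = int(demo["Start_Time"].isna().sum())

# Hypothetical derived column: accident duration in minutes.
demo["Duration_Min"] = (demo["End_Time"] - demo["Start_Time"]).dt.total_seconds() / 60

print(unparsed)
print(demo["Duration_Min"])
```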

5️⃣ Exporting the Cleaned Sample

The cleaned dataset was saved for further use:

output_path = r"C:\Users\vange\Desktop\github project\cleaned_accidents.csv"
cleaned_accidents.to_csv(output_path, index=False)

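The cleaned file is saved at C:\Users\vange\Desktop\github project\cleaned_accidents.csv. This path is specific to the author's machine; change output_path to a location that exists in your own environment.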
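A path built with `pathlib` is easier to move between machines than a hard-coded absolute path. The sketch below is an illustration only: a temporary directory stands in for the project folder, and the DataFrame holds made-up rows.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up rows standing in for the cleaned sample.
demo = pd.DataFrame({"State": ["CA", "TX"], "Severity": [2, 3]})

# A temporary directory stands in for the project folder in this sketch.
project_dir = Path(tempfile.mkdtemp())
output_path = project_dir / "cleaned_accidents.csv"

# index=False keeps the row index out of the file, as in the original step.
demo.to_csv(output_path, index=False)

# Read the file back to confirm the export worked.
round_trip = pd.read_csv(output_path)
print(round_trip.shape)
```

In a real project, `project_dir` could instead be a folder relative to the repository root.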

Team Members

This project was a collaborative effort by:

- Georgios Birmpakos
- Vasileios Katsikas
- Evangelos Diaskoufis

Questionnaire Highlights

📝 Key Areas of Focus

A. Accidents Per State

- Objective: Identify the states with the highest and lowest accident counts.
- Insights: Highlight accident trends across states.
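One way to approach this question is to count rows per state. The sketch below uses made-up data and assumes the state code is stored in a column named `State`.

```python
import pandas as pd

# Made-up data; assumes a State column holding two-letter state codes.
demo = pd.DataFrame({"State": ["CA", "CA", "CA", "TX", "TX", "FL"]})

# value_counts returns the counts sorted from highest to lowest.
per_state = demo["State"].value_counts()

highest_state = per_state.idxmax()
lowest_state = per_state.idxmin()

print(per_state)
print(highest_state, lowest_state)
```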

B. Most Prone State for Accidents

- Determine the state with the highest accident frequency.
- Analyze whether there is a significant gap between it and the other states.
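The size of the gap can be expressed in several ways. This sketch, on made-up data with an assumed `State` column, computes the leading state's share of all accidents and its ratio to the runner-up; both measures are illustrative choices, not the project's stated method.

```python
import pandas as pd

# Made-up data; assumes a State column holding two-letter state codes.
demo = pd.DataFrame({"State": ["CA", "CA", "CA", "CA", "TX", "TX", "FL", "NY"]})

# Counts sorted from highest to lowest.
per_state = demo["State"].value_counts()

top_state = per_state.index[0]

# Share of all accidents that fall in the leading state.
top_share = per_state.iloc[0] / per_state.sum()

# How many times more accidents the leader has than the second-ranked state.
ratio_to_runner_up = per_state.iloc[0] / per_state.iloc[1]

print(top_state, top_share, ratio_to_runner_up)
```

Raw counts favor populous states, so a per-capita comparison may be a useful complement if population figures are available.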

C. Accidents Per Month

- Goal: Examine seasonal trends in accidents.
- Identify the peak months in which accidents are most common.
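Once Start_Time is a datetime column, monthly counts follow from the `.dt.month` accessor. The sketch below uses made-up timestamps and pools all years together, which is one possible reading of "per month".

```python
import pandas as pd

# Made-up timestamps spanning two years.
demo = pd.DataFrame({
    "Start_Time": pd.to_datetime([
        "2021-01-05 08:00:00",
        "2021-01-20 09:00:00",
        "2021-12-24 18:00:00",
        "2022-07-04 12:00:00",
        "2022-12-01 07:00:00",
        "2022-12-31 23:00:00",
    ])
})

# Count accidents per calendar month, pooling all years together.
counts = demo["Start_Time"].dt.month.value_counts()

# Reindex so that months without accidents appear with a count of 0.
per_month = counts.reindex(range(1, 13), fill_value=0)

peak_month = per_month.idxmax()
print(per_month)
print(peak_month)
```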

D. Accidents Near Specific Locations

- Analyze accidents near:
  - Junctions
  - Stop signs
  - Traffic signals
- Calculate the percentage of total accidents occurring in these areas.
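If each location feature is stored as a boolean flag, the percentages reduce to column means. The sketch below uses made-up data and assumes the flags are named `Junction`, `Stop`, and `Traffic_Signal`; check these names against the actual file.

```python
import pandas as pd

# Made-up data; the column names are assumptions about the dataset schema.
demo = pd.DataFrame({
    "Junction": [True, False, False, False],
    "Stop": [False, False, True, False],
    "Traffic_Signal": [True, True, False, False],
})

flags = ["Junction", "Stop", "Traffic_Signal"]

# The mean of a boolean column is the fraction of True values.
pct_by_flag = demo[flags].mean() * 100

# Percentage of accidents near at least one of the three features.
pct_any = demo[flags].any(axis=1).mean() * 100

print(pct_by_flag)
print(pct_any)
```

Because one accident can be near several features at once, the per-flag percentages can add up to more than `pct_any`.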

Next Steps

🔍 Exploratory Data Analysis (EDA)

- Accident Severity Analysis:
  - Explore severity distribution across cities and states.
  - Identify the most dangerous areas based on accident severity.
- Temporal and Spatial Trends:
  - Locate geographical hotspots for accidents.
  - Analyze time-based trends:
    - Peak accident hours
    - Seasonal patterns (e.g., months or holidays).
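Two of the planned analyses can be sketched with a groupby and the `.dt.hour` accessor. The data below is made up, the column names `State`, `Severity`, and `Start_Time` are assumed, and using the mean severity as a measure of danger is an illustrative choice rather than the project's stated method.

```python
import pandas as pd

# Made-up data; assumes State, Severity, and Start_Time columns.
demo = pd.DataFrame({
    "State": ["CA", "CA", "TX", "TX", "TX"],
    "Severity": [2, 4, 2, 2, 3],
    "Start_Time": pd.to_datetime([
        "2021-01-05 08:10:00",
        "2021-01-06 17:45:00",
        "2021-02-10 17:30:00",
        "2021-02-11 17:05:00",
        "2021-03-01 08:20:00",
    ]),
})

# Average severity per state, highest first.
severity_by_state = (
    demo.groupby("State")["Severity"].mean().sort_values(ascending=False)
)

# Number of accidents per hour of day (0 to 23).
per_hour = demo["Start_Time"].dt.hour.value_counts()
peak_hour = per_hour.idxmax()

print(severity_by_state)
print(peak_hour)
```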

Summary

This documentation outlines the key steps in cleaning and preparing the "US Accidents" dataset for analysis. The cleaned dataset, containing 10,000 sampled rows, is now ready for:

- Deeper analysis of accident severity and trends.
- Visualization of hotspots, time-based patterns, and critical accident locations.

Stay tuned for further insights and visualizations! 🚗💥
