This Jupyter Notebook uses pandas DataFrames to analyze a police dataset covering January 2005 to December 2012.
- Data Loading: The dataset is loaded from a CSV file using `pd.read_csv()`.
- Data Cleaning: The column 'country_name' is removed as it contains only missing values.
- Speeding Analysis: The number of men and women stopped for speeding is compared using `value_counts()`.
- Search Analysis: The relationship between gender and searches conducted during a stop is analyzed using `groupby()` and `sum()`.
- Stop Duration Analysis: The mean stop duration is calculated after mapping the duration categories to numerical values.
- Age Distribution Analysis: The age distributions for each violation are compared using `groupby()` and `describe()`.
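The analysis steps listed above can be sketched roughly as follows on a tiny synthetic DataFrame. This is an illustration under stated assumptions, not the notebook's actual code: apart from 'country_name', every column name (`driver_gender`, `driver_age`, `violation`, `search_conducted`, `stop_duration`), the duration category labels, and the minute values they map to are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the police dataset. Only 'country_name' comes from
# the description above; all other column names and values are assumptions.
df = pd.DataFrame({
    "country_name": [np.nan] * 6,
    "driver_gender": ["M", "F", "M", "M", "F", "M"],
    "driver_age": [25, 31, 40, 22, 35, 50],
    "violation": ["Speeding", "Speeding", "Equipment",
                  "Speeding", "Equipment", "Speeding"],
    "search_conducted": [False, False, True, True, False, False],
    "stop_duration": ["0-15 Min", "16-30 Min", "0-15 Min",
                      "30+ Min", "0-15 Min", "16-30 Min"],
})

# Data cleaning: remove the column that holds only missing values.
df = df.drop(columns="country_name")

# Speeding analysis: count stops per gender among speeding violations.
speeding = df[df["violation"] == "Speeding"]
speeding_by_gender = speeding["driver_gender"].value_counts()

# Search analysis: summing a boolean column counts the True values per group.
searches_by_gender = df.groupby("driver_gender")["search_conducted"].sum()

# Stop duration analysis: map each category to a representative number of
# minutes (assumed values), then average.
duration_minutes = {"0-15 Min": 7.5, "16-30 Min": 23.0, "30+ Min": 45.0}
mean_stop_duration = df["stop_duration"].map(duration_minutes).mean()

# Age distribution analysis: summary statistics of age for each violation.
age_by_violation = df.groupby("violation")["driver_age"].describe()

print(speeding_by_gender)
print(searches_by_gender)
print(mean_stop_duration)
print(age_by_violation)
```

The representative minute values are a modelling choice: any mapping from category to number works with `map()`, but the resulting mean is only as meaningful as the values chosen.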
- Pandas
- NumPy
- Make sure the Police Data.csv file exists at the path specified in the notebook.
- Run the notebook cells sequentially to perform the analysis.
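A minimal sketch of loading the file with a path check might look like the following. The helper name `load_police_data` is hypothetical, and dropping every all-missing column with `dropna(how="all")` is a generalization of the 'country_name' removal, not necessarily what the notebook does. The demo reads a small in-memory CSV with assumed column names so it runs without the real file.

```python
import io
from pathlib import Path

import pandas as pd


def load_police_data(source):
    """Read the CSV and drop columns that contain only missing values.

    `source` may be a file path or an open text buffer.
    """
    # Fail early with a clear message if a path was given but no file exists.
    if isinstance(source, (str, Path)) and not Path(source).is_file():
        raise FileNotFoundError(f"CSV file not found: {source}")
    df = pd.read_csv(source)
    # Drop any column in which every value is missing.
    return df.dropna(axis="columns", how="all")


# In-memory demo; 'stop_date' and 'driver_gender' are assumed column names.
sample_csv = io.StringIO(
    "stop_date,country_name,driver_gender\n"
    "2005-01-02,,M\n"
    "2005-01-18,,F\n"
)
demo = load_police_data(sample_csv)
print(demo.columns.tolist())
```

With the real dataset, the call would be `load_police_data("Police Data.csv")`, adjusted to wherever the file is stored.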