HariShanmugavelu/MachineLearning_AppliedStatistics

MachineLearning_AppliedStatistics

Imported the necessary libraries

Read the data as a data frame

Performed basic EDA, which included the following steps, and printed the insights at every step:

a. Shape of the data

b. Data type of each attribute

c. Checking the presence of missing values

d. 5 point summary of numerical attributes

e. Distribution of ‘bmi’, ‘age’ and ‘charges’ columns.

f. Measure of skewness of ‘bmi’, ‘age’ and ‘charges’ columns

g. Checking the presence of outliers in ‘bmi’, ‘age’ and ‘charges’ columns

h. Distribution of categorical columns (including children)

i. Pair plot that includes all the columns of the data frame
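
The non-plotting EDA steps above (a to d, f, g and h) could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data frame is synthetic, the `smoker` column name and the 1.5 × IQR outlier rule are assumptions, and the helper `iqr_outlier_count` is hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the data set (assumption: the real data is not shown here).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "bmi": rng.normal(30.0, 6.0, size=n).round(1),
    "children": rng.integers(0, 4, size=n),
    "smoker": rng.choice(["yes", "no"], size=n, p=[0.2, 0.8]),  # assumed column name
    "charges": rng.lognormal(mean=9.0, sigma=0.8, size=n).round(2),
})
numeric_cols = ["bmi", "age", "charges"]

print(df.shape)           # a. shape of the data
print(df.dtypes)          # b. data type of each attribute
print(df.isnull().sum())  # c. missing values per column

# d. five-point summary: min, Q1, median, Q3, max
summary = df[numeric_cols].describe().loc[["min", "25%", "50%", "75%", "max"]]
print(summary)

# f. skewness of the numerical columns
skewness = df[numeric_cols].skew()
print(skewness)


def iqr_outlier_count(series):
    """Count values outside 1.5 * IQR from the quartiles (hypothetical helper)."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)
    return int(mask.sum())


# g. number of outliers per numerical column
outliers = {col: iqr_outlier_count(df[col]) for col in numeric_cols}
print(outliers)

# h. distribution of categorical columns, including children
print(df["smoker"].value_counts())
print(df["children"].value_counts().sort_index())
```

The distribution plots (e) and the pair plot (i) are left out so that the sketch depends only on pandas and NumPy; a plotting library would typically be used for those steps.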

The notebook also analyzed the following questions with statistical evidence:

a. Do charges of people who smoke differ significantly from the people who don't?

b. Does bmi of males differ significantly from that of females?

c. Is the proportion of smokers significantly different in different genders?

d. Is the distribution of bmi across women with no children, one child and two children the same?
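
One common way to approach the four questions above with `scipy.stats` is sketched below. The choice of tests (Welch's t-test for a and b, a chi-square test of independence for c, one-way ANOVA for d) is an assumption, since the README does not say which tests the notebook used. The data is synthetic, and the `sex` and `smoker` column names and the 0.05 significance level are also assumptions.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in data (assumption: not the data set used in the notebook).
rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({
    "sex": rng.choice(["male", "female"], size=n),              # assumed column name
    "smoker": rng.choice(["yes", "no"], size=n, p=[0.2, 0.8]),  # assumed column name
    "children": rng.integers(0, 4, size=n),
    "bmi": rng.normal(30.0, 6.0, size=n),
})
# Smokers are given higher charges on purpose so that test (a) has an effect to detect.
df["charges"] = rng.lognormal(9.0, 0.5, size=n) + np.where(df["smoker"] == "yes", 20000.0, 0.0)

alpha = 0.05  # assumed significance level

# a. Do charges differ between smokers and non-smokers? (Welch's t-test)
t_a, p_a = stats.ttest_ind(
    df.loc[df["smoker"] == "yes", "charges"],
    df.loc[df["smoker"] == "no", "charges"],
    equal_var=False,
)

# b. Does bmi differ between males and females? (Welch's t-test)
t_b, p_b = stats.ttest_ind(
    df.loc[df["sex"] == "male", "bmi"],
    df.loc[df["sex"] == "female", "bmi"],
    equal_var=False,
)

# c. Is the proportion of smokers different across genders? (chi-square test of independence)
table = pd.crosstab(df["sex"], df["smoker"])
chi2, p_c, dof, expected = stats.chi2_contingency(table)

# d. Is bmi the same across women with 0, 1 and 2 children? (one-way ANOVA)
women = df[df["sex"] == "female"]
groups = [women.loc[women["children"] == k, "bmi"] for k in (0, 1, 2)]
f_d, p_d = stats.f_oneway(*groups)

for label, p in [("a", p_a), ("b", p_b), ("c", p_c), ("d", p_d)]:
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"Question {label}: p-value = {p:.4f} -> {decision}")
```

In each case the null hypothesis is "no difference" (or "no association" for question c), and it is rejected when the p-value falls below `alpha`. If the normality assumption behind the t-test or ANOVA looks doubtful, rank-based alternatives such as `stats.mannwhitneyu` or `stats.kruskal` could be substituted.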

About

Applied Statistics used for Machine Learning problems
