For this project, I used Kaggle’s Red Wine Quality dataset to build several classification models that predict whether a particular red wine is “good quality” or not. Each wine in this dataset is given a “quality” score between 0 and 10. For the purpose of this project, I converted that score to a binary output where each wine is either “good quality” (a score of 7 or higher) or not (a score below 7). The quality of a wine is predicted from 11 input variables:
Fixed acidity
Volatile acidity
Citric acid
Residual sugar
Chlorides
Free sulfur dioxide
Total sulfur dioxide
Density
pH
Sulfates
Alcohol
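The binary conversion described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the project's actual code: the function name `to_binary_label`, the constant `GOOD_THRESHOLD`, and the sample scores are hypothetical, and only the threshold of 7 comes from the description above.

```python
from collections import Counter

# Threshold taken from the text: a score of 7 or higher counts as "good quality".
GOOD_THRESHOLD = 7


def to_binary_label(quality, threshold=GOOD_THRESHOLD):
    """Return 1 for a good quality wine (score >= threshold), otherwise 0."""
    return 1 if quality >= threshold else 0


# Hypothetical scores for illustration only; these are not rows from the dataset.
sample_scores = [5, 6, 7, 4, 8, 6, 5, 7]

labels = [to_binary_label(score) for score in sample_scores]

# Class balance: how many wines fall on each side of the threshold.
counts = Counter(labels)
good_share = counts[1] / len(labels)

print(labels)
print(f"good: {counts[1]}, not good: {counts[0]}, share good: {good_share:.3f}")
```

With the data in a pandas DataFrame `df` that has a `quality` column (an assumption about how the CSV is loaded), the same step is usually written as `(df["quality"] >= 7).astype(int)`.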
Objectives
The objectives of this project are as follows:
To experiment with different classification methods to see which yields the highest accuracy
To determine which features are the most indicative of a good quality wine
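Both objectives can be approached with scikit-learn along the following lines. This is a rough sketch under stated assumptions, not the project's implementation: the three models are my own pick for illustration, and the data is a synthetic stand-in generated with `make_classification` so that the snippet runs without the CSV. The wine feature names are attached only to show the shape of the output, so the resulting accuracies and rankings say nothing about real wines.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# The 11 input variables listed above, used here purely as labels.
FEATURES = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulfates", "alcohol",
]

# ASSUMPTION: synthetic stand-in data with 11 features and an imbalanced
# binary target. Replace X and y with the real features and binary label.
X, y = make_classification(
    n_samples=600, n_features=len(FEATURES), n_informative=5,
    n_redundant=2, weights=[0.85, 0.15], random_state=0,
)

# Stratify so both splits keep the same good/not-good proportion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0,
)

# Illustrative model choices; logistic regression gets scaled inputs.
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Objective 1: compare test-set accuracy across models.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)

best_name = max(accuracies, key=accuracies.get)

# Objective 2: rank features by the random forest's impurity-based importances.
forest = models["random forest"]
ranked_features = sorted(
    zip(FEATURES, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, accuracy in accuracies.items():
    print(f"{name}: {accuracy:.3f}")
print("best by accuracy:", best_name)
for feature, importance in ranked_features:
    print(f"{feature}: {importance:.3f}")
```

Because good wines are the minority class, accuracy alone can look high for a model that rarely predicts "good", so it may be worth checking precision and recall for that class as well.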
Steps included in this project:
Importing Libraries
Loading Data
Understanding Data
Missing Values
Exploring Variables (Data Analysis)
Feature Selection
Proportion of Good vs Bad Wines
Preparing Data for Modelling
Applying different models
Choosing the right model
Hurray, you just completed the task!
CHEERS!
Feel free to drop a star if you like it.