Classification task: predict 'Rings' from the other features. The dataset contains physical measurements of abalone, and the number of rings indicates the age of the abalone. This project classifies the data with a KNN classifier.
The dataset is taken from link.
- Google Colab
- Pandas
- Numpy
- Matplotlib
- Seaborn
- Scikit-learn
- Improving on KNN using the weighted KNN method
- Ablation study on Normalization
Loading the dataset and splitting into Train-Test set
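A minimal sketch of this step, assuming the data is held in a pandas DataFrame. The source file name is not given here, so the snippet builds a small synthetic stand-in instead of reading a file. The column names, the one-hot encoding of `Sex`, and the 80/20 split are assumptions, not the project's confirmed settings.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumption: the real project reads the dataset from a file (for example with
# pd.read_csv). A synthetic stand-in with a few abalone-like columns is used here.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Sex": rng.choice(["M", "F", "I"], size=n),
    "Length": rng.uniform(0.1, 0.8, size=n),
    "Diameter": rng.uniform(0.1, 0.6, size=n),
    "Height": rng.uniform(0.01, 0.3, size=n),
    "Rings": rng.integers(3, 15, size=n),
})

# One-hot encode the categorical column so KNN can compute distances on it.
X = pd.get_dummies(df.drop(columns="Rings"), columns=["Sex"])
y = df["Rings"]

# Assumption: 80/20 split with a fixed seed; the original ratio is not stated.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)
```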
Normalization and training
Running the model using its default configuration on the test data
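The normalization, training, and default-configuration evaluation could look roughly like the sketch below. It assumes `StandardScaler` as the normalization step (the README does not name the scaler) and uses synthetic data in place of the abalone features.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): four features on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4)) * np.array([1.0, 10.0, 100.0, 0.1])
y = np.digitize(X[:, 0] + 0.5 * rng.normal(size=300), [-0.5, 0.5])  # 3 classes
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Default configuration: n_neighbors=5, weights="uniform", Minkowski with p=2.
knn = KNeighborsClassifier()
knn.fit(X_train_s, y_train)
accuracy = knn.score(X_test_s, y_test)
print(f"Test accuracy with default KNN: {accuracy:.3f}")
```

Fitting the scaler on the training split alone keeps test-set statistics out of the preprocessing.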
Looping over values of K. Finding best accuracy and K value
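A sketch of the search over K, under stated assumptions: the range 1 to 20 is hypothetical (the tested range is not given here), and `make_classification` supplies synthetic data in place of the abalone set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption), three classes.
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Assumption: K ranges over 1..20.
k_values = list(range(1, 21))
accuracies = []
for k in k_values:
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train_s, y_train)
    accuracies.append(model.score(X_test_s, y_test))

# Pick the K with the highest test accuracy (first one wins on ties).
best_index = int(np.argmax(accuracies))
best_k = k_values[best_index]
best_accuracy = accuracies[best_index]
print(f"Best K = {best_k}, accuracy = {best_accuracy:.3f}")
```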
Plotting the graph of K values against Accuracies
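One possible way to draw such a plot with Matplotlib is sketched below. The data is synthetic and the K range of 1 to 20 is an assumption; only the plotting calls are the point of the example.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption), scaled as in the normalization step.
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Accuracy for each K (assumed range 1..20).
k_values = list(range(1, 21))
accuracies = [
    KNeighborsClassifier(n_neighbors=k).fit(X_train_s, y_train).score(X_test_s, y_test)
    for k in k_values
]

# One line: K on the x-axis, test accuracy on the y-axis.
fig, ax = plt.subplots()
ax.plot(k_values, accuracies, marker="o")
ax.set_xlabel("K")
ax.set_ylabel("Test accuracy")
ax.set_title("KNN accuracy vs. K")
```

In a notebook such as Colab the figure renders when the cell finishes; in a plain script, a call to `plt.show()` would be needed.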
Improving KNN with the weighted KNN method using 3 schemes: Default, Manhattan and Euclidean
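The sketch below shows one plausible reading of the three schemes in scikit-learn terms. The mapping is an assumption: "default" as uniform weights, "Manhattan" as distance weighting with `p=1`, and "Euclidean" as distance weighting with `p=2`. The data is synthetic and the K range is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption).
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Assumed mapping of the three schemes onto KNeighborsClassifier parameters.
schemes = {
    "default": {"weights": "uniform"},
    "manhattan": {"weights": "distance", "p": 1},
    "euclidean": {"weights": "distance", "p": 2},
}

k_values = list(range(1, 21))  # assumption: K in 1..20
results = {name: [] for name in schemes}
for name, params in schemes.items():
    for k in k_values:
        model = KNeighborsClassifier(n_neighbors=k, **params)
        model.fit(X_train_s, y_train)
        results[name].append(model.score(X_test_s, y_test))

for name, accs in results.items():
    best = max(accs)
    print(f"{name}: best accuracy {best:.3f} at K={k_values[accs.index(best)]}")
```

With `weights="distance"`, closer neighbours get a larger vote (weight proportional to the inverse of their distance), while `p` selects the Minkowski distance used to find them.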
Plotting the graph of K vs Accuracy for all the 3 configurations with normalization
Ablation Study on Normalization
Performing the ablation study by removing the normalization step from the preprocessing pipeline
Plotting the resulting accuracies against the K values for all three configurations (without normalization)
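The ablation can be sketched by running the same model with and without the scaling step. This is an illustration only: it assumes `StandardScaler` as the normalization, uses synthetic features rescaled to different magnitudes, and covers only the uniform-weight configuration for brevity.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption), with features stretched to very
# different scales so that the effect of normalization is visible.
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X = X * np.array([1.0, 50.0, 0.01, 1000.0, 5.0, 0.1])
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

def evaluate(use_scaler, k):
    """Return test accuracy for KNN with or without the scaling step."""
    steps = [StandardScaler()] if use_scaler else []
    model = make_pipeline(*steps, KNeighborsClassifier(n_neighbors=k))
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)

k_values = list(range(1, 21))  # assumption: K in 1..20
ablation = {
    "with_normalization": [evaluate(True, k) for k in k_values],
    "without_normalization": [evaluate(False, k) for k in k_values],
}
for name, accs in ablation.items():
    print(f"{name}: mean accuracy {np.mean(accs):.3f}")
```

Wrapping the scaler and classifier in one pipeline means the only difference between the two runs is the presence of the normalization step.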
- Effect of normalization/standardization: normalization does not make the classifier reach high accuracy for any of the tested values of k. This applies to both uniform KNN and weighted KNN.
- Without normalization, performance drops as k (and with it the neighbourhood size) increases. This is not observed with normalized data.
- The different weighting schemes do not differ much in performance.
- The accuracy is low overall (below 30%), which suggests that more complex models might be needed to classify well in this domain.