Skip to content

Latest commit

 

History

History

PCA+KNN_on_CIFAR10

CIFAR-10 Dataset Classification w/ KNN + PCA

KNN and PCA Implementation

Iterates over different k and s values for KNN and PCA.

To reproduce results:

conda env create -f cifar10_environment.yml
conda activate cifar10
python cifar10_pca_knn_analysis.py

Q. Do you think that PCA is effective for kNN?

A. Yes! With a very small PCA value the error rate is very similar to performing kNN algorithm with the full data.

Q. Can you identify a value s << n for which the classification error with s-PCA is similar to the classification error on the original data?

A. At s = 20 the classification error is less than 5% away from the classification error on the full images (equal to s = 1024). The drop is substantial from s = 1 (0.9 error = random...) till s = 20. Afterwards remaining error rate difference gradually nears the non-PCA result.

Q. Is the value of this s related to the number of neighbors k or is it independent of it?

A. The value of s is independent of the number of neighbors. Increasing k definitely improves the result, but doesn’t seem to effect the ideal s value.

graph table