Brain Tumor MRI Classification | Step 2 Cross Validation Techniques and Hyperparameter Tuning #136

Merged · 3 commits · Jun 2, 2024
28 changes: 28 additions & 0 deletions Brain Tumor MRI Classification/Cross Validation Techniques.md
@@ -0,0 +1,28 @@
# Cross Validation Techniques for Brain Tumor MRI Classification
____

The dataset on which the cross validation is carried out can be found [on kaggle](https://www.kaggle.com/datasets/theiturhs/brain-tumor-mri-classification-dataset/data). Find the implementation in the CrossValidation_Techniques.ipynb notebook.

### Cross Validation Techniques carried out are as follows:
*Different techniques for splitting the dataset into training and testing sets*
* Hold-Out CV
* K-Fold Cross Validation
* Repeated K-Fold
* Leave-One-Out (LOO)
* Stratified K-Fold

### Different techniques and their obtained training accuracies
| Technique | Average Accuracy across all classes | Pituitary | No Tumor | Meningioma | Glioma |
| ---- | ---- | ---- | ---- | ---- | ---- |
| Hold-Out CV | 0.9380 | 0.9860 | 0.9760 | 0.8210 | 0.9690 |
| K-Fold CV | 0.9604 | 0.9858 | 0.9858 | 0.9196 | 0.9420|
| Repeated K-Fold CV | 0.9585 | 0.9504 | 0.9838 | 0.9255 | 0.9703|
| Stratified K-Fold CV | 0.9548 | 0.9544 | 0.9838 | 0.9096 | 0.9667|

**Out of these techniques, the K-Fold CV technique gives the best overall accuracy and strong class-wise accuracy. Therefore, this technique can be used to divide our dataset into training and testing sets.**

1. Hold-Out Technique: The dataset was randomly split into an 80% training set and a 20% validation set. The accuracy achieved was 93.80%.
2. K-Fold Cross Validation: This divides the data into k equal-sized folds and trains the model k times, each time using k-1 folds as training data and the remaining fold as validation data. The accuracy achieved was 96.04% (see the sketch after this list).
3. Repeated K-Fold: This repeats the k-fold CV process multiple times with different random splits of the data. It achieved 95.85% accuracy.
4. Leave-One-Out: This is a special case of k-fold CV where k equals the number of samples in the dataset, so each sample is used as the validation set once. It was not implemented here because it is computationally expensive and would have required a lot of training time.
5. Stratified K-Fold: This is like k-fold CV, but ensures that each fold preserves the percentage of samples for each class. It achieved an accuracy of 95.48%.
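
The following is a minimal sketch, using scikit-learn with dummy index arrays standing in for the MRI images and labels, of how the splits described above can be generated; the variable names, sample counts, and seeds are illustrative, not the notebook's exact code.

```python
import numpy as np
from sklearn.model_selection import train_test_split, KFold, RepeatedKFold, StratifiedKFold

# Dummy stand-ins for the MRI samples and their four class labels
# (pituitary, no tumor, meningioma, glioma) -- illustrative only.
X = np.arange(100).reshape(-1, 1)      # 100 "images" represented by indices
y = np.repeat([0, 1, 2, 3], 25)        # 25 samples per class

# 1. Hold-Out: a single random 80/20 split
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# 2. K-Fold: 5 equal folds, each used once as the validation set
kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(kf.split(X)):
    print(f"K-Fold {fold}: train={len(train_idx)} val={len(val_idx)}")

# 3. Repeated K-Fold: the K-Fold procedure repeated with different random splits
rkf = RepeatedKFold(n_splits=5, n_repeats=2, random_state=42)

# 5. Stratified K-Fold: like K-Fold, but each fold keeps the class proportions
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    print(f"Stratified fold {fold}: val class counts = {np.bincount(y[val_idx])}")
```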


101 changes: 101 additions & 0 deletions Brain Tumor MRI Classification/Hyper-Parameter Tuning.md
@@ -0,0 +1,101 @@
# Hyper-Parameter Tuning Techniques for Brain Tumor MRI Classification

The dataset on which the hyper-parameter tuning is carried out can be found [on kaggle](https://www.kaggle.com/datasets/theiturhs/brain-tumor-mri-classification-dataset/data). Find the implementation in the HyperParameter_Tuning.ipynb notebook. FastAI's Learner module provides a convenient way to create and fine-tune convolutional neural network (CNN) models. `vision_learner` is a function that constructs a Learner object, which bundles the model architecture, data, training configuration, and other elements. We can specify a pre-trained model architecture and fine-tune it on the dataset.
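
A minimal sketch of how such a learner might be set up is shown below; the dataset path, image size, and number of epochs are illustrative assumptions, not the notebook's exact values.

```python
from fastai.vision.all import *

# Build DataLoaders from an image-folder layout of the Kaggle dataset
# ("brain-tumor-mri-dataset" is a hypothetical local path).
dls = ImageDataLoaders.from_folder(
    "brain-tumor-mri-dataset",
    valid_pct=0.2,              # random 20% validation split
    item_tfms=Resize(224),      # resize every MRI slice to 224x224
    bs=32,
)

# vision_learner wraps a pre-trained backbone, the data, and the training config
learn = vision_learner(dls, resnet18, metrics=accuracy)

# Fine-tune the pre-trained weights on the MRI classes
learn.fine_tune(5)
```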

### Hyper-Parameter Tuning Techniques carried out are as follows:
*Different techniques to find suitable hyper-parameters*
* Random Search optimization algorithm
* Hyperparameter Optimization with Optuna's Successive Halving Pruner

### Different techniques and their obtained training accuracies
| Technique | Accuracy Score | Architecture | Weight Decay | Epochs | Batch Size | Drop |
| -- | -- | -- | -- | -- | -- | -- |
| Random Search optimization algorithm - Run 1 | 95.27% | ResNet50 | 5.527e-5 | 7 | 64 | 0.4 |
| Random Search optimization algorithm - Run 2 | 95.53% | ResNet50 | 4e-6 | 5 | 64 | 0.2 |
| Hyperparameter Optimization with Optuna's Successive Halving Pruner | 98.51% | ResNet34 | 0.004016 | 13 | 32 | 0.2680 |

**Out of these techniques, Hyperparameter Optimization with Optuna's Successive Halving Pruner gives the best overall accuracy. Each technique takes roughly 40-50 minutes to fine-tune the model.**

NOTE: Since the goal here is only to find the best hyper-parameter tuning technique for our dataset, and each technique takes about 40-50 minutes on average to produce results (only the training dataset was considered, not the augmented data), random splitting is implemented to divide the dataset.

#### Random Search Optimizing Algorithm - Run 1

For n_trials = 10, the accuracy scores and best hyper-parameters are as follows:

| Trial No. | Best Score | Architecture | Weight Decay | Epochs | Batch Size | Drop |
|-----------|------------|--------------|--------------|--------|------------|------|
| 0 | 0.9011 | ResNet34 | 0.00024 | 8 | 64 | 0.4 |
| 1 | 0.9413 | ResNet18 | 0.0090 | 15 | 64 | 0.2 |
| 2 | 0.9343 | ResNet18 | 0.0065 | 5 | 32 | 0.4 |
| 3 | 0.9080 | ResNet34 | 0.00062 | 5 | 32 | 0.4 |
| 4 | 0.9019 | ResNet34 | 0.00092 | 6 | 64 | 0.2 |
| **5** | **0.9527** | **ResNet50** | **0.00005** | **7** | **64** | **0.4** |
| 6 | 0.9220 | ResNet34 | 0.00895 | 15 | 64 | 0.2 |
| 7 | 0.9404 | ResNet18 | 0.0035 | 11 | 64 | 0.4 |
| 8 | 0.9212 | ResNet18 | 0.0002 | 5 | 64 | 0.4 |
| 9 | 0.9203 | ResNet34 | 0.0007 | 13 | 32 | 0.4 |

The best accuracy score is 0.9527 with these hyperparameters:

- Architecture: ResNet 50
- Weight Decay: 5.527e-5
- Epochs: 7
- Batch Size: 64
- Drop: 0.4
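
A sketch of the random-search loop behind these runs is given below, assuming the fastai setup shown earlier; the search ranges are inferred from the values appearing in the trial tables and the dataset path is hypothetical, so this is illustrative rather than the notebook's exact code.

```python
import random
from fastai.vision.all import *

ARCHS = {"resnet18": resnet18, "resnet34": resnet34, "resnet50": resnet50}

def sample_params():
    # Sample one hyper-parameter configuration; ranges inferred from the tables above
    return {
        "arch": random.choice(list(ARCHS)),
        "wd": 10 ** random.uniform(-6, -2),          # weight decay on a log scale
        "epochs": random.randint(5, 15),
        "bs": random.choice([32, 64]),
        "drop": random.choice([0.2, 0.4]),
    }

def run_trial(params, path="brain-tumor-mri-dataset"):   # hypothetical dataset path
    dls = ImageDataLoaders.from_folder(
        path, valid_pct=0.2, item_tfms=Resize(224), bs=params["bs"]
    )
    learn = vision_learner(
        dls, ARCHS[params["arch"]], metrics=accuracy,
        wd=params["wd"], ps=params["drop"],              # weight decay and head dropout
    )
    learn.fine_tune(params["epochs"])
    return float(learn.validate()[1])                    # validation accuracy

best_score, best_params = 0.0, None
for _ in range(10):                                      # n_trials = 10
    params = sample_params()
    score = run_trial(params)
    if score > best_score:
        best_score, best_params = score, params
print(best_score, best_params)
```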

#### Random Search Optimizing Algorithm - Run 2

For n_trials = 10, the accuracy scores and best hyper-parameters are as follows:

| Trial | Best Score | Architecture | Weight Decay | Epochs | Batch Size | Drop |
|-------|------------|--------------|--------------|--------|------------|------|
| 0 | 0.9177 | ResNet34 | 0.000248 | 6 | 32 | 0.2 |
| 1 | 0.9492 | ResNet50 | 0.002137 | 7 | 64 | 0.4 |
| 2 | 0.9518 | ResNet50 | 0.000004 | 8 | 64 | 0.2 |
| 3 | 0.9046 | ResNet34 | 0.000269 | 6 | 64 | 0.2 |
| 4 | 0.9378 | ResNet50 | 0.000058 | 6 | 32 | 0.2 |
| 5 | 0.9352 | ResNet18 | 0.000154 | 7 | 32 | 0.2 |
| 6 | 0.9063 | ResNet34 | 0.000006 | 5 | 64 | 0.4 |
| **7** | **0.9553** | **ResNet50** | **0.000004** | **13** | **64** | **0.2** |
| 8 | 0.9238 | ResNet34 | 0.000824 | 13 | 64 | 0.4 |
| 9 | 0.9361 | ResNet18 | 0.000018 | 8 | 32 | 0.2 |

The best accuracy score is 0.9553 with these hyperparameters:

- Architecture: ResNet 50
- Weight Decay: 4e-6
- Epochs: 5
- Batch Size: 64
- Drop: 0.2

#### Hyperparameter Optimization with Optuna's Successive Halving Pruner

For n_trials = 10, the accuracy scores and hyper-parameters are:

| Trial | Best Score | Architecture | Weight Decay | Epochs | Batch Size | Drop |
|-------|------------|--------------|--------------|--------|------------|--------------------|
| 0 | 0.9623 | resnet50 | 2.057e-06 | 8 | 32 | 0.2888 |
| 1 | 0.9518 | resnet50 | 0.001421 | 7 | 32 | 0.2707 |
| 2 | 0.9807 | resnet34 | 3.744e-06 | 9 | 32 | 0.3918 |
| 3 | 0.9641 | resnet18 | 2.667e-05 | 7 | 64 | 0.3196 |
| 4 | 0.9711 | resnet34 | 0.005883 | 8 | 64 | 0.2000 |
| 5 | 0.9650 | resnet18 | 5.694e-06 | 6 | 64 | 0.2266 |
| 6 | 0.9737 | resnet18 | 1.813e-05 | 6 | 64 | 0.3732 |
| **7** | **0.9851** | **resnet34** | **0.004016** | **13** | **32** | **0.2680** |
| 8 | 0.9667 | resnet50 | 0.008095 | 15 | 32 | 0.3013 |
| 9 | 0.9632 | resnet50 | 0.0006995 | 6 | 32 | 0.3595 |

The best accuracy score is 0.9851 with these hyperparameters:

- Architecture: ResNet 34
- Weight Decay: 0.004016
- Epochs: 13
- Batch Size: 32
- Drop: 0.268
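
A sketch of how Optuna's Successive Halving Pruner can be wired to this kind of fastai training loop follows; the per-epoch reporting scheme, the search ranges, and the dataset path are illustrative assumptions, not the notebook's exact code.

```python
import optuna
from fastai.vision.all import *

def objective(trial):
    # Search space mirroring the trial table above
    arch_name = trial.suggest_categorical("arch", ["resnet18", "resnet34", "resnet50"])
    wd = trial.suggest_float("wd", 1e-6, 1e-2, log=True)
    epochs = trial.suggest_int("epochs", 5, 15)
    bs = trial.suggest_categorical("bs", [32, 64])
    drop = trial.suggest_float("drop", 0.2, 0.4)

    dls = ImageDataLoaders.from_folder(
        "brain-tumor-mri-dataset",                     # hypothetical dataset path
        valid_pct=0.2, item_tfms=Resize(224), bs=bs,
    )
    arch = {"resnet18": resnet18, "resnet34": resnet34, "resnet50": resnet50}[arch_name]
    learn = vision_learner(dls, arch, metrics=accuracy, wd=wd, ps=drop)

    learn.fine_tune(1)                                 # brief warm-up of the new head
    acc = 0.0
    for epoch in range(epochs):
        learn.fit_one_cycle(1)                         # continue one epoch at a time
        acc = float(learn.validate()[1])
        trial.report(acc, step=epoch)                  # per-epoch reports let the pruner halve weak trials early
        if trial.should_prune():
            raise optuna.TrialPruned()
    return acc

study = optuna.create_study(
    direction="maximize",
    pruner=optuna.pruners.SuccessiveHalvingPruner(),
)
study.optimize(objective, n_trials=10)
print(study.best_value, study.best_params)
```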

##### Summarizing

1. Random search optimization gives almost 95% accuracy across the two runs of 10 trials each.
2. Hyperparameter Optimization with Optuna's Successive Halving Pruner reaches 98.51%, which is a remarkable accuracy.

**So we will use the K-Fold cross-validation technique followed by Hyperparameter Optimization with Optuna's Successive Halving Pruner to obtain the appropriate hyper-parameters.**
1 change: 1 addition & 0 deletions Brain Tumor MRI Classification/HyperParameter_Tuning.ipynb
