This document provides an overview of the steps involved in evaluating machine learning models for anomaly detection in network traffic. The following sections detail each phase of the model evaluation process.
The dataset is loaded, and a subset (10%) of the data is sampled to expedite testing. Features and labels are identified, with labels combined into a single column for classification. Classes with insufficient samples are removed to ensure adequate data for model training and evaluation.
The most recent runtime of this code was recorded as 43 seconds.
This step analyzes the distribution of labels in the dataset. It categorizes labels into small, medium, and large groups based on their frequency and visualizes this distribution with horizontal bar charts. Additionally, it compares the percentage of benign versus attack instances.
Output: Bar charts showing label distribution and the percentage of benign vs. attack instances.
The most recent runtime of this code was recorded as 4 seconds.
This step filters attack samples from the dataset and creates separate files for each attack type. It ensures each attack file contains a balanced mix of attack and benign samples, with 30% of the samples being attacks and 70% being benign. The benign samples are randomly selected to match the required ratio.
Output: Separate CSV files for each attack type, saved in the ./attacks/ directory.
The most recent runtime of this code was recorded as 27 seconds.
Feature importance is analyzed specifically for attack files using the Random Forest model. This involves calculating the importance of each feature and visualizing these importances to understand which features are most significant for detecting attacks.
The most recent runtime of this code was recorded as 61 seconds.
Feature importance is assessed across the entire dataset, including both attack and normal traffic. This analysis helps in identifying the most influential features for overall model performance.
The most recent runtime of this code was recorded as 306 seconds.
This step assesses the performance of a RandomForestClassifier model for multi-label classification. The dataset is split into training and testing sets, with added noise to simulate real-world conditions. The model is trained and evaluated, with accuracy metrics, classification reports, and confusion matrices generated for each label. Class distribution and sample counts are also reported.
The most recent runtime of this code was recorded as 504 seconds.
Models are trained and evaluated specifically on attack files to assess their performance. This includes calculating and comparing accuracy, precision, recall, and other metrics to determine how well the models detect different attack types.
The most recent runtime of this code was recorded as 272 seconds.
Machine learning models are trained using a selected set of features from the dataset. This step includes evaluating the models' performance with these features and comparing the results to assess their effectiveness.
The most recent runtime of this code was recorded as 176 seconds.
A comparison of different models based on various performance measures, including accuracy, precision, recall, and AUC, is conducted. This step provides a comprehensive view of how each model performs in terms of detecting anomalies.
The most recent runtime of this code was recorded as 17 seconds.
The final implementation involves training and evaluating the selected machine learning models using the complete dataset. Performance metrics are calculated, and results are analyzed to determine the best model for anomaly detection.
The most recent runtime of this code was recorded as 556 seconds.
Graphs comparing the performance of different models are generated, including ROC curves and bar plots for accuracy, precision, and recall. These visualizations help in understanding the relative performance of each model.
The most recent runtime of this code was recorded as 50 seconds.
Heat maps are used to visualize confusion matrices, providing a clear view of how each model performs in terms of true positives, false positives, true negatives, and false negatives.
The most recent runtime of this code was recorded as 1 second.
Performance metrics, including accuracy, precision, and recall, are computed for each model and displayed in tabular and graphical formats. This helps in comparing the effectiveness of different models in detecting anomalies.
The most recent runtime of this code was recorded as 66 seconds.
Confusion matrices for each model are generated to evaluate the performance in classifying different attack types and normal traffic. These matrices provide detailed information on the number of true and false classifications.
The most recent runtime of this code was recorded as 89 seconds.
The importance of features is visualized using a bar graph, showing how each feature contributes to the performance of the Random Forest model. This helps in identifying key features that influence model predictions.
The most recent runtime of this code was recorded as 23 seconds.