Hacker Forum Exploit and Classification for Proactive Cyber Threat Intelligence

This research project uses hacker forum data for proactive cyber threat intelligence (CTI). It applies state-of-the-art machine learning and deep learning approaches to automatically classify hacker forum data into predefined categories, and develops interactive visualizations that let CTI practitioners explore the collected data for proactive and timely CTI. The results show that, among all the models, the deep learning RNN GRU model gives the best classification results, with 99.025% accuracy and 96.56% precision.

Update - The high accuracy and precision scores reported above were caused by overfitting. After rerunning the updated code, the results show that RNN GRU still gives the best classification results, but with a reasonable accuracy of 98.8% and precision of 96.6%. Among the ML models, SVM shows an accuracy of 97.3% and a precision of 76.4%, whereas Random Forest shows an accuracy of 99.6% and a precision of 96.7%.
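
Accuracy and precision can diverge sharply, as the SVM scores above suggest. The sketch below is only an illustration of how the two metrics are computed and why that gap can appear on imbalanced labels. The labels are hand-made and hypothetical, not the project's data, and the helper names `accuracy` and `precision` are assumptions introduced here. How the project averages precision across its categories is not stated, so this example uses a single positive class.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Of the samples predicted as `positive`, the fraction that truly are."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_pos:
        return 0.0  # no positive predictions, so precision is reported as 0 here
    return sum(t == positive for t in predicted_pos) / len(predicted_pos)


# Hypothetical labels: 90 irrelevant posts (0) and 10 relevant posts (1).
y_true = [0] * 90 + [1] * 10
# A toy classifier that wrongly flags 3 irrelevant posts and catches all 10 relevant ones.
y_pred = [0] * 87 + [1] * 3 + [1] * 10

acc = accuracy(y_true, y_pred)
prec = precision(y_true, y_pred)
print(f"accuracy={acc:.3f} precision={prec:.3f}")  # accuracy=0.970 precision=0.769
```

Only 3 of 100 predictions are wrong, so accuracy stays at 97%, but those 3 false positives are a large share of the 13 positive predictions, so precision drops to about 77%. In a Sklearn workflow, `sklearn.metrics.accuracy_score` and `sklearn.metrics.precision_score` compute the same quantities.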

Tools & Libraries Used

  • Python 3.7
  • TensorFlow
  • Anaconda
  • scikit-learn (Sklearn)
  • Pandas
  • Keras
  • Seaborn
  • NumPy
  • NLTK

Word Vector Used

Dataset Used

Data Visualizations

Label Classification

ML Models Scores Visualization

DL Models Scores Visualization

DL Model Accuracy Scores Visualizations

DL Model Precision Scores Visualizations

Published Research

Contributors
