C++ implementation of
-
Isconna: Streaming Anomaly Detection with Frequency and Patterns. Rui Liu, Siddharth Bhatia, Bryan Hooi. (Under Review)
If you find dep/AUROC
and dep/mio
are empty, use this command to clone the dependencies.
git submodule update --init
The following steps are necessary before choosing the standalone or the docker build.
-
Open a terminal (Linux/macOS) or a Visual Studio Developer Command Prompt (Windows)
-
cd
to the project rootIsconna
-
mkdir out
-
tar -xf data/data.zip -C data
(Windows)-
Or
unzip data/data.zip -d data
(Linux/macOS) -
Or
7z x data/data.zip -odata
-
If you want to build and run directly on your machine …
-
cmake -DCMAKE_BUILD_TYPE=Release -GNinja -S . -B build/release
-
Remove
-GNinja
if you don’t haveninja
-
-
cmake --build build/release --target Demo
-
build\release\Demo.exe
(Windows) orbuild/release/Demo
(Linux/macOS)
The demo runs Isconna-EO on CIC-IDS2018 (data/CIC-IDS2018/processed/Data.csv
) and prints ROC-AUC.
An empty out/Score.tsv
will be created.
Basically the same as the Linux build, but some steps are different.
-
Decompress
data/data.zip
to WSL’s drive, e.g.,/root/Dataset
-
Set environment variable
DATASET_DIR
, e.g.,export DATASET_DIR=/root/Dataset
See https://docs.microsoft.com/en-us/windows/wsl/compare-versions for the reason.
WSL1 does not have this issue.
If you want to use docker, at the cost of some speed …
-
docker build -t isconna .
-
docker run -v %cd%\data:/Isconna/data -v %cd%\out:/Isconna/out isconna
(Windows)-
On Linux/macOS, use
docker run -v $PWD/data:/Isconna/data -v $PWD/out:/Isconna/out isconna
-
This does the same thing as the standalone method.
The image also includes dependencies for experiments.
Only for the demo (example/Demo.cpp
).
Uncomment the fprintf
in the for loop.
out/Score.txt
has 1 column: the final anomaly score.
Cores are declared within the body. You can search for the variable isc
, then uncomment the desired core, and comment out the others.
Parameters and dataset paths are specified in the Parameter section of the code.
You need to prepare three files:
-
Meta file
-
Only includes an integer
n
, the number of records in the dataset -
Assign its path to
pathMeta
-
E.g.,
data/CIC-IDS2018/processed/Meta.txt
-
-
Data file
-
A header-less csv file with shape
[n,3]
-
Each row includes 3 integers: source, destination and timestamp
-
Timestamps should start from 1 and be continuous
-
Assign its path to
pathData
-
E.g.,
data/CIC-IDS2018/processed/Data.csv
-
-
Label file
-
A header-less text file with shape
[n,1]
-
Each row includes 1 integer: 0 if normal, 1 if anomalous
-
Assign its path to
pathLabel
-
E.g.,
data/CIC-IDS2018/processed/Label.csv
-
If you only use Isconna as a library, you can use a sparse clone.
-
git clone --depth=1 --filter=blob:none --sparse https://github.com/liurui39660/Isconna.git
-
cd Isconna
-
git sparse-checkout set include
To use Isconna in your code, you need to
-
Include headers
include/EdgeNodeCore.hpp
and/orinclude/EdgeOnlyCore.hpp
-
Instantiate cores with required parameters
-
Number of CMS rows
-
Number of CMS columns
-
Decay factor (default is 0, i.e., keep nothing)
-
-
Call
operator()
on individual records, the signature includes-
Source (categorical)
-
Destination (categorical)
-
Timestamp
-
Weight for the frequency score
-
Weight for the width score
-
Weight for the gap score
-
Return value is the anomaly score
-
-
Python: liurui39660/Isconna.Python
Please consider citing our arXiv preprint if you want to use our code for you research.
@misc{liu2021isconna, title={Isconna: Streaming Anomaly Detection with Frequency and Patterns}, author={Rui Liu and Siddharth Bhatia and Bryan Hooi}, year={2021}, eprint={2104.01632}, archivePrefix={arXiv}, primaryClass={cs.LG} }