Certainty-Attacks

This repository contains the code for our paper "Certainty attacks using explainability preprocessing". We show that metrics such as certainty, which are often used for adversarial attack detection, can be easily manipulated by incorporating even simple XAI techniques.
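
Here, "certainty" refers to the confidence a classifier assigns to its prediction, typically the maximum softmax probability. A minimal, hypothetical PyTorch sketch of that metric (the function and its arguments are placeholders, not part of this repository):

import torch
import torch.nn.functional as F

def prediction_certainty(model, x):
    # Return the predicted class and its softmax confidence ("certainty") for a batch x.
    model.eval()
    with torch.no_grad():
        logits = model(x)                   # shape: (batch, num_classes)
        probs = F.softmax(logits, dim=1)    # class probabilities
        certainty, pred = probs.max(dim=1)  # highest probability and its class index
    return pred, certainty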

Each file begins with a short description of its output; in functions_file.py, the functions concerning the new strategy also carry short descriptions of what they do.

  1. First read functions_file.py to get an understanding of all the functions and transformations needed for the new strategy, the models, and the remaining strategies.
  2. Read success_confidence_deepf_newstg.py and run a small test. This is always the first file to run, since it produces the correct_predictions file used by the transferability and image-quality tests.
  3. Afterwards, run success_confidence_fgsm_bim to produce the results of these strategies.
  4. Run the two image-quality tests.
  5. Read transferability_all.py and run the transferability test after you have loaded all the models you will use as targets; list them in the script under the variable models_to_test for each dataset you have (see the sketch after this list).
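
As an illustration of step 5, a hedged sketch of how models_to_test could look inside transferability_all.py; the exact structure in the script may differ, and all file names other than Resnet18_model_JAFFE_two.pt are placeholders:

import torch

# Hypothetical layout: one list of target checkpoints per dataset
models_to_test = {
    'JAFFE': ['Resnet18_model_JAFFE_two.pt', 'other_target_model_JAFFE.pt'],
    'FER': ['some_target_model_FER.pt'],
}

# Load each target model before running the transferability test
loaded_targets = {path: torch.load(path, map_location='cpu') for path in models_to_test['JAFFE']}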

In the results folder you will find the CSV files generated by the .py scripts (except transferability) for a small test using the following parameters:

For success_confidence_deepf_newstg (use the same parameters in the other files as well):

  - model to load (with .pt extension): Resnet18_model_JAFFE_two.pt
  - how many images to test: 10
  - how many neighbors for LIME: 1
  - decay factor for the new strategy: 1
  - name of dataset (can be JAFFE or FER): JAFFE
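
It is not specified above how the scripts read these values (prompts, constants, or command-line arguments), so the following is purely illustrative, restating the same small-test parameters as plain Python constants:

MODEL_PATH = 'Resnet18_model_JAFFE_two.pt'  # model to load (.pt extension)
N_IMAGES = 10                               # how many images to test
LIME_NEIGHBORS = 1                          # how many neighbors for LIME
DECAY_FACTOR = 1                            # decay factor for the new strategy
DATASET = 'JAFFE'                           # 'JAFFE' or 'FER'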

For displaying the results of the experiments:

  - success of each strategy: sum(label_strategy != real_label) / count(correctly classified images)
  - image quality of each strategy: the mean value of each distance metric
  - confidence: a histogram of the confidence of the adversaries that fooled the models
  - transferability: the mini code below

import pandas as pd

# Load the transferability results produced for one source model
x = pd.read_csv('transferability_from_Resnet18_model_two_nb_strat3.csv')
# Columns from index 8 onward hold the labels predicted by each target model
transfer_strat = list(x.columns[8:])
# Transfer rate: fraction of rows where a target model's label differs from the real label
dict_transfers = {}
for i in transfer_strat:
    dict_transfers[i] = len(x[x[i] != x['real_label']]) / len(x)

print(dict_transfers)
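
Analogous, hedged sketches for the other summaries; the file name and column names below (label_strategy, confidence, l2_distance) are assumptions about the generated CSVs, which are assumed to contain only the correctly classified images:

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical file/column names; adjust to the CSVs your run actually produced
df = pd.read_csv('results/success_confidence_deepf_newstg_JAFFE.csv')

# Success: share of (correctly classified) images whose adversarial label differs from the real one
success = (df['label_strategy'] != df['real_label']).sum() / len(df)

# Image quality: mean value of a distance-metric column
mean_l2 = df['l2_distance'].mean()

# Confidence: histogram of the confidence of adversaries that fooled the model
fooled = df[df['label_strategy'] != df['real_label']]
fooled['confidence'].plot(kind='hist')
plt.show()

print(success, mean_l2)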

Visualization of the Attack Strategy

(Figure: Certainty Attack)
