import torch
from torch import autograd
import torch.nn as nn
import torch.nn.functional as F
import numpy as np

TARGET_MULT = 10000.0

USE_CUDA = torch.cuda.is_available()

# autograd.Variable is a deprecated no-op wrapper in modern PyTorch; this
# helper just moves tensors onto the GPU when one is available.
def Variable(*args, **kwargs):
    v = autograd.Variable(*args, **kwargs)
    return v.cuda() if USE_CUDA else v

# Per-dimension observation standard deviations, used to scale the
# perturbation budget for the low-dimensional environments. Kept as float32
# numpy arrays so that torch.from_numpy() in rand_attack works directly.
CARTPOLE_STD = np.array([0.7322321, 1.0629482, 0.12236707, 0.43851405], dtype=np.float32)
ACROBOT_STD = np.array([0.36641926, 0.65119815, 0.6835106, 0.67652863, 2.0165246, 3.0202584], dtype=np.float32)


def pgd(model, X, y, verbose=False, params={}, env_id="", norm_type='l_2'):
    # Observations arrive as pixel values in [0, 255]; attack in [0, 1]
    # without mutating the caller's tensor.
    X = X.float() / 255
    epsilon = params.get('epsilon', 0.00392)
    niters = params.get('niters', 10)
    img_min = params.get('img_min', 0.0)
    img_max = params.get('img_max', 1.0)
    network_type = params.get('network_type', 'nature')
    loss_func = params.get('loss_func', nn.CrossEntropyLoss())
    step_size = epsilon / niters
    y = Variable(torch.as_tensor(y))
    if verbose:
        print('epsilon: {}, step size: {}, target label: {}'.format(epsilon, step_size, y))

    X_adv = Variable(X.data, requires_grad=True)

    for i in range(niters):
        if network_type == 'noisynet':
            # Resample the NoisyNet parameter noise for this attack step.
            model.model.sample()

        _, logits = model.forward_requires_grad(X_adv, return_q=True)

        loss = loss_func(logits, y)
        if verbose:
            print('current loss: ', loss.data.cpu().numpy())
        model.zero_grad()
        loss.backward()

        # Gradient step on the attack loss.
        if norm_type == 'l_inf':
            eta = step_size * X_adv.grad.data.sign()
        elif norm_type == 'l_2':
            grad_norm = torch.norm(X_adv.grad.data)
            # Guard against a zero gradient before normalizing.
            if grad_norm.item() == 0:
                eta = step_size * X_adv.grad.data
            else:
                eta = step_size * X_adv.grad.data / grad_norm

        X_adv = Variable(X_adv.data + eta, requires_grad=True)
        # Project back into the epsilon-ball around the clean input
        # (an interval clamp for l_inf, a norm rescaling for l_2).
        if norm_type == 'l_inf':
            eta = torch.clamp(X_adv.data - X.data, -epsilon, epsilon)
        elif norm_type == 'l_2':
            eta = X_adv.data - X.data
            if torch.norm(eta) > epsilon:
                eta *= epsilon / torch.norm(eta)

        X_adv.data = X.data + eta
        if verbose:
            print('max eta: ', np.max(abs(eta.data.cpu().numpy())))
            print('linf diff before clamp: ', np.max(abs(X_adv.data.cpu().numpy() - X.data.cpu().numpy())))

        X_adv.data = torch.clamp(X_adv.data, img_min, img_max)
        if verbose:
            print('linf diff after clamp: ', np.max(abs(X_adv.data.cpu().numpy() - X.data.cpu().numpy())))

    if verbose:
        print('{} iterations'.format(i + 1))

    # Rescale back to integer pixel values, matching the input encoding.
    return torch.clamp((X_adv.data * 255).long(), 0, 255)
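

# A hedged usage sketch, not part of the attack logic: assumes observations
# are pixel batches in [0, 255] (matching the /255 scaling in pgd) and that
# `model` exposes the act() / forward_requires_grad() interface used above.
# The shape below is illustrative only.
def _demo_pgd(model):
    obs = torch.randint(0, 256, (1, 4, 84, 84)).float()
    if USE_CUDA:
        obs = obs.cuda()
    actions = model.act(obs)  # assumed to return a (batch,) tensor of greedy actions
    return pgd(model, obs, actions,
               params={'epsilon': 0.00392, 'niters': 10}, norm_type='l_inf')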


def fgsm(model, X, y, verbose=False, params={}):
    epsilon = params.get('epsilon', 1)
    img_min = params.get('img_min', 0.0)
    img_max = params.get('img_max', 1.0)
    X_adv = Variable(X.data, requires_grad=True)
    logits = model.forward(X_adv)
    # F.nll_loss expects log-probabilities; use cross_entropy on raw logits.
    loss = F.cross_entropy(logits, y)
    model.zero_grad()
    loss.backward()
    eta = epsilon * X_adv.grad.data.sign()
    X_adv = Variable(X_adv.data + eta, requires_grad=True)
    X_adv.data = torch.clamp(X_adv.data, img_min, img_max)
    return X_adv.data
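

# Hedged note: fgsm here is essentially a single-step l_inf variant of pgd,
# but it expects X already scaled to [img_min, img_max] and integer class
# labels y. A minimal sketch under those assumptions:
def _demo_fgsm(model, obs, actions):
    return fgsm(model, obs, actions, params={'epsilon': 0.00392})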


def rand_attack(model, X, y, verbose=False, params={}, env_id=""):
    epsilon = params.get('epsilon', 0.00392)
    # Scale the budget per observation dimension for the low-dimensional
    # environments, using the empirical standard deviations above.
    if env_id == "CartPole-v0":
        epsilon = torch.from_numpy(CARTPOLE_STD) * epsilon
    if env_id == "Acrobot-v1":
        epsilon = torch.from_numpy(ACROBOT_STD) * epsilon
    img_min = params.get('img_min', 0.0)
    img_max = params.get('img_max', 1.0)
    # Uniform noise in [-epsilon, epsilon].
    noise = 2 * epsilon * torch.rand(X.data.size()) - epsilon
    if USE_CUDA:
        noise = noise.cuda()
    return torch.clamp(X.data + noise, img_min, img_max)
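

# Hedged usage sketch (hypothetical shapes): a CartPole observation is a
# 4-dimensional vector, so the per-dimension CARTPOLE_STD budget broadcasts
# against a (batch, 4) input. `model` and `y` are unused by rand_attack.
def _demo_rand_attack():
    obs = torch.zeros(1, 4)
    if USE_CUDA:
        obs = obs.cuda()  # rand_attack moves its noise to the GPU when available
    return rand_attack(None, obs, None, params={'epsilon': 0.1},
                       env_id="CartPole-v0")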


def attack(model, X, attack_config, loss_func=nn.CrossEntropyLoss(), epsilon=0.00392, smooth_type='', network_type='nature'):
    # method = attack_config.get('method', 'pgd')
    # verbose = attack_config.get('verbose', False)
    # params = attack_config.get('params', {})
    method = 'pgd'
    verbose = False
    params = {
        'epsilon': epsilon,
        'network_type': network_type,
        'loss_func': loss_func,
    }

    if network_type == 'noisynet':
        model.model.sample()

    if smooth_type == 'local':
        _, output = model.forward(X, cert=False, return_q=True)
    elif smooth_type == 'global':
        _, output = model.forward(X, return_q=True)
    else:
        raise NotImplementedError(f'smooth_type = {smooth_type} not implemented!')

    # Untargeted attack: label each state with the model's current greedy action.
    y = torch.argmax(output, dim=1)
    # y = model.act(X, cert=False)
    if method == 'cw':
        # No Carlini-Wagner implementation is provided in this file.
        raise NotImplementedError('cw attack is not implemented')
    elif method == 'rand':
        atk = rand_attack
    elif method == 'fgsm':
        atk = fgsm
    else:
        atk = pgd
    adv_X = atk(model, X, y, verbose=verbose, params=params)
    if verbose:
        abs_diff = abs(adv_X.cpu().numpy() - X.cpu().numpy())
        print('adv image range: {}-{}, ori action: {}, adv action: {}, '
              'l1 norm: {}, l2 norm: {}, linf norm: {}'.format(
                  torch.min(adv_X).cpu().numpy(), torch.max(adv_X).cpu().numpy(),
                  model.act(X)[0], model.act(adv_X)[0],
                  np.sum(abs_diff), np.linalg.norm(abs_diff), np.max(abs_diff)))
    return adv_X
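

# Hedged usage sketch (hypothetical model interface): assumes `model`
# implements the forward(X, return_q=True) / act(X) API referenced above,
# as in the surrounding project's DQN wrappers. smooth_type must be
# 'local' or 'global', otherwise attack() raises NotImplementedError.
def _demo_attack(model, obs):
    return attack(model, obs, attack_config={}, epsilon=0.00392,
                  smooth_type='global', network_type='nature')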