UNICORN: A Unified Backdoor Trigger Inversion Framework
Authors: Zhenting Wang, Kai Mei, Juan Zhai, Shiqing Ma
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our prototype UNICORN is general and effective in inverting backdoor triggers in DNNs. The code can be found at https://github.com/RU-System-Software-and-Security/UNICORN. ... Based on the devised optimization problem, we implemented a prototype UNICORN (Unified Backdoor Trigger Inversion) in PyTorch and evaluated it on nine different models and eight different backdoor attacks (i.e., Patch attack (Gu et al., 2017), Blend attack (Chen et al., 2017), SIG (Barni et al., 2019), moon filter, kelvin filter, 1977 filter (Liu et al., 2019), WaNet (Nguyen & Tran, 2021) and BppAttack (Wang et al., 2022c)) on CIFAR-10 and ImageNet dataset. Results show UNICORN is effective for inverting various types of backdoor triggers. On average, the attack success rate of the inverted triggers is 95.60%, outperforming existing trigger inversion methods. |
| Researcher Affiliation | Academia | Zhenting Wang, Kai Mei, Juan Zhai, Shiqing Ma Department of Computer Science, Rutgers University {zhenting.wang,kai.mei,juan.zhai,sm2283}@rutgers.edu |
| Pseudocode | No | The paper formulates an optimization problem (Eq. 4 and 5) and describes its implementation, but it does not include a separately labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | The code can be found at https://github.com/RU-System-Software-and-Security/UNICORN. |
| Open Datasets | Yes | Two publicly available datasets (i.e., CIFAR-10 (Krizhevsky et al., 2009) and ImageNet (Russakovsky et al., 2015)) are used to evaluate the effectiveness of UNICORN. The details of the datasets can be found in the A.4. ... CIFAR10 (Krizhevsky et al., 2009). This dataset is built for recognizing general objects such as dogs, cars, and planes. It contains 50000 training samples and 10000 training samples in 10 classes. ImageNet (Russakovsky et al., 2015). This dataset is a large-scale object classification benchmark. In this paper, we use a subset of the original ImageNet dataset specified in Li et al. (Li et al., 2021d). The subset has 100000 training samples and 10000 test samples in 200 classes. |
| Dataset Splits | No | The paper specifies training and test sample counts for CIFAR-10 and ImageNet datasets (e.g., '50000 training samples and 10000 training samples' for CIFAR-10, likely a typo for test samples; '100000 training samples and 10000 test samples' for ImageNet). However, it does not explicitly provide a distinct 'validation' dataset split with specific numbers or percentages. |
| Hardware Specification | Yes | We conduct all experiments on a Ubuntu 20.04 server equipped with 64 CPUs and six Quadro RTX 6000 GPUs. |
| Software Dependencies | Yes | Our method is implemented with Python 3.8 and PyTorch 1.11. |
| Experiment Setup | Yes | By default, we set α = 0.01, β as 10% of the input space, γ = 0.85, and δ = 0.5. |
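To make the reported setup concrete, the sketch below shows the general shape of optimization-based trigger inversion that UNICORN's objective (Eq. 4/5) builds on: jointly optimizing a mask and a pattern so that stamped inputs flip to a target label, with a weight on trigger size. This is a hedged toy illustration, not the paper's method: the "model" is a hand-rolled linear classifier with analytic gradients (so the example is self-contained without PyTorch), and every name here except `alpha = 0.01` (the paper's default) is an assumption for illustration.

```python
import numpy as np

# Toy stand-in for optimization-based trigger inversion. The real UNICORN
# objective operates on DNNs in PyTorch; a linear model with hand-derived
# gradients keeps this sketch runnable anywhere. alpha matches the paper's
# default weight; all other values are illustrative assumptions.
rng = np.random.default_rng(0)

D = 16                        # toy input dimension
W = rng.normal(size=D)        # weights of a toy linear "backdoored" model
X = rng.normal(size=(32, D))  # clean samples to stamp the trigger on

alpha = 0.01                  # weight on the trigger-size (mask L1) term
lr = 0.1

mask = np.full(D, 0.1)        # soft mask in [0, 1]
pattern = np.zeros(D)         # trigger pattern

for _ in range(500):
    stamped = (1 - mask) * X + mask * pattern   # apply candidate trigger
    logits = stamped @ W
    probs = 1 / (1 + np.exp(-logits))           # prob. of target label
    # Loss: push every stamped sample toward the target label (prob -> 1),
    # plus an L1 penalty keeping the trigger small.
    grad_logits = probs - 1.0                   # d(BCE)/d(logit)
    grad_stamped = grad_logits[:, None] * W[None, :]
    grad_mask = (grad_stamped * (pattern - X)).mean(0) + alpha * np.sign(mask)
    grad_pattern = (grad_stamped * mask).mean(0)
    mask = np.clip(mask - lr * grad_mask, 0.0, 1.0)
    pattern -= lr * grad_pattern

# Fraction of stamped samples classified as the target label.
asr = ((((1 - mask) * X + mask * pattern) @ W) > 0).mean()
print(f"attack success rate on toy data: {asr:.2f}")
```

In the paper's actual setting the gradients come from autograd through the victim DNN, and the remaining defaults (β as 10% of the input space, γ = 0.85, δ = 0.5) govern terms of the unified objective that this toy omits.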