Distilling Cognitive Backdoor Patterns within an Image

Authors: Hanxun Huang, Xingjun Ma, Sarah Monazam Erfani, James Bailey

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments to show that CD can robustly detect a wide range of advanced backdoor attacks.
Researcher Affiliation | Academia | ¹School of Computing and Information Systems, The University of Melbourne, VIC, Australia; ²School of Computer Science, Fudan University, Shanghai, China
Pseudocode | Yes | Algorithm 1 Unlearning and Fine-tuning (see the assumed sketch below the table)
Open Source Code | Yes | Code is available at https://github.com/HanxunH/CognitiveDistillation.
Open Datasets | Yes | We perform evaluations on 3 datasets, including CIFAR-10 (Krizhevsky et al., 2009), an ImageNet (Deng et al., 2009) subset (200 classes), and GTSRB (Houben et al., 2013). CelebA (Liu et al., 2015) is also used for bias detection. (see the loading sketch below the table)
Dataset Splits | No | The paper evaluates performance on training and test sets, but does not explicitly detail a separate validation split for model training.
Hardware Specification | Yes | All experiments are run with NVIDIA Tesla P100/V100/A100 GPUs with PyTorch implementations.
Software Dependencies | No | The paper mentions 'PyTorch implementations' but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | For CIFAR-10 experiments, we train for 60 epochs with the SGD optimizer, weight decay 5×10⁻⁴, and an initial learning rate of 0.1 decayed by 0.1 at the 45th epoch. For our CD method, we use the Adam optimizer (Kingma & Ba, 2014) with initial learning rate 0.1, β₁ = 0.1, β₂ = 0.1, and a total of 100 steps to learn the input mask. We set p_b to 2.5% and p_c to 70%, and optimize for 5 epochs with the learning rate set to 5×10⁻⁴. (optimizer sketches below the table)
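
The datasets named in the Open Datasets row are all public, and the three image-classification ones plus CelebA can be pulled through torchvision. Below is a minimal loading sketch, assuming a recent torchvision release (the GTSRB loader is a newer addition); the paper's 200-class ImageNet subset is custom and not reproduced here.

    # Minimal sketch: loading the public datasets cited in the paper via
    # torchvision. The paper's 200-class ImageNet subset is custom and
    # omitted; torchvision's ImageNet loader also needs a manual download.
    from torchvision import datasets, transforms

    to_tensor = transforms.ToTensor()

    cifar10 = datasets.CIFAR10(root="data", train=True, download=True, transform=to_tensor)
    gtsrb = datasets.GTSRB(root="data", split="train", download=True, transform=to_tensor)
    celeba = datasets.CelebA(root="data", split="train", download=True, transform=to_tensor)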
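
To make the quoted Experiment Setup concrete, here is a minimal PyTorch sketch of the two optimizers it describes: SGD for backbone training and Adam (β₁ = β₂ = 0.1, 100 steps) for learning the per-input mask. The mask objective shown, matching the model's output on the masked input while keeping the mask sparse, is a simplified stand-in for the loss defined in the paper; model, images, and alpha are assumed placeholders.

    import torch

    # Backbone training on CIFAR-10, per the quoted setup: SGD, weight decay
    # 5e-4, lr 0.1 decayed by 0.1 at epoch 45 of 60. Momentum is not stated
    # in the quote; 0.9 is a common default (assumption).
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
    sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[45], gamma=0.1)

    # CD mask learning: Adam with lr 0.1 and betas (0.1, 0.1), 100 steps.
    mask = torch.full_like(images, 0.5, requires_grad=True)
    mask_opt = torch.optim.Adam([mask], lr=0.1, betas=(0.1, 0.1))

    with torch.no_grad():
        target = model(images)  # reference output on the clean input

    for _ in range(100):
        m = mask.clamp(0.0, 1.0)
        out = model(images * m)  # simplified; the paper's objective also
                                 # perturbs the masked-out region
        # fidelity to the clean output + L1 sparsity on the mask
        loss = (out - target).abs().sum() + alpha * m.abs().sum()
        mask_opt.zero_grad()
        loss.backward()
        mask_opt.step()

In the paper's framing, the learned mask is what exposes backdoors: a backdoored input needs only a very small cognitive pattern to reproduce the model's output, so the mask's sparsity can serve as a detection score.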
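
Algorithm 1 (Unlearning and Fine-tuning) is only named in the extracted text, so the following is an assumed reading rather than a transcription of the paper's pseudocode: gradient-ascent unlearning on the detected backdoor fraction (p_b = 2.5%), followed by ordinary fine-tuning on the high-confidence clean fraction (p_c = 70%), for the 5 epochs at learning rate 5×10⁻⁴ quoted above. detected_loader, clean_loader, and the choice of SGD are hypothetical.

    import torch
    import torch.nn.functional as F

    # Optimizer choice is not specified in the quote; SGD is an assumption.
    opt = torch.optim.SGD(model.parameters(), lr=5e-4)

    for _ in range(5):  # "optimize for 5 epochs"
        # Unlearning: gradient ascent on samples CD flagged as backdoored (p_b).
        for x, y in detected_loader:
            loss = -F.cross_entropy(model(x), y)  # negated loss => ascent
            opt.zero_grad()
            loss.backward()
            opt.step()
        # Fine-tuning: standard descent on high-confidence clean samples (p_c).
        for x, y in clean_loader:
            loss = F.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()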