On Attacking Out-Domain Uncertainty Estimation in Deep Neural Networks

Authors: Huimin Zeng, Zhenrui Yue, Yang Zhang, Ziyi Kou, Lanyu Shang, Dong Wang

IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on various benchmark image datasets show that the uncertainty estimated by state-of-the-art methods could be easily corrupted by our attack. Section 5 (Experiments): We evaluate the efficacy of our proposed out-domain uncertainty attack by assessing to what extent the victim model can be deceived into making high-confidence predictions for perturbed out-domain data.
Researcher Affiliation | Academia | Huimin Zeng¹, Zhenrui Yue¹, Yang Zhang², Ziyi Kou¹, Lanyu Shang¹, Dong Wang¹ — ¹University of Illinois at Urbana-Champaign, ²University of Notre Dame
Pseudocode | Yes | Algorithm 1, "Perturbing out-domain data" (a hedged sketch of such a perturbation loop is given after this table).
Open Source Code | No | Footnote: "Code will be released after acceptance of this paper."
Open Datasets | Yes | Following the experimental design presented in [Lakshminarayanan et al., 2016; Van Amersfoort et al., 2020; van Amersfoort et al., 2021], we use MNIST [LeCun et al., 1998] vs. Not MNIST, and CIFAR-10 [Krizhevsky and Hinton, 2009] vs. SVHN [Netzer et al., 2011] to test the efficacy of our proposed attack on corrupting the out-domain uncertainty estimation algorithms in DNNs.
Dataset Splits | No | The paper describes using in-domain data for training and out-domain data for testing, following common experimental setups, but does not give explicit details on validation splits, percentages, or how they were generated.
Hardware Specification | No | The paper mentions deep neural networks and various models but provides no details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or library versions).
Experiment Setup | Yes | In our experiments, two different training adversarial radii (ε_tr) are used. For MNIST vs. Not MNIST, adversarial training with ε_tr = 0.1 and ε_tr = 0.2 is performed, whereas for CIFAR-10 vs. SVHN, the training adversarial radius ε_tr is set to 0.016 and 0.031, respectively. Moreover, to attack Not MNIST images, the adversarial radius ε is set to 0.1, and ε = 0.016 for SVHN images. Also, for the robustness study, ε is changed from 0.1 (corresponding to H_adv and R_adv) to 0.2 (H′_adv and R′_adv) and 0.3 (H″_adv and R″_adv) for Not MNIST. Similarly, for SVHN, ε is changed from 0.016 to 0.031 and 0.063. (A hedged configuration sketch collecting these radii follows the table.)
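
The Pseudocode row notes that the paper includes Algorithm 1, "Perturbing out-domain data". The paper's exact objective and update rule are not reproduced here; the following is a minimal PGD-style sketch under the assumption that the attack lowers predictive entropy (i.e., raises confidence) on out-domain inputs within an L∞ ball of radius ε. The function name and the `alpha`/`steps` hyperparameters are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def perturb_out_domain(model, x_out, eps=0.1, alpha=0.01, steps=40):
    """Sketch of an out-domain uncertainty attack (assumption: PGD on entropy).

    Pushes out-domain inputs toward low predictive entropy (high confidence)
    while staying inside an L-inf ball of radius eps around the clean input.
    Illustrative reconstruction only, not the authors' exact Algorithm 1.
    """
    x_adv = x_out.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        probs = F.softmax(model(x_adv), dim=1)
        # Predictive entropy H(p) = -sum_c p_c log p_c; lower entropy = higher confidence.
        entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=1).mean()
        grad = torch.autograd.grad(entropy, x_adv)[0]
        # Descend on entropy: step against the sign of the gradient.
        x_adv = x_adv.detach() - alpha * grad.sign()
        # Project back into the eps-ball and the valid pixel range.
        x_adv = torch.max(torch.min(x_adv, x_out + eps), x_out - eps).clamp(0.0, 1.0)
    return x_adv.detach()
```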
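
The Experiment Setup row lists the adversarial radii used for training, attack, and the robustness study. For convenience, they can be collected into a single configuration; the dictionary below is a hypothetical summary of those numbers (the 0.016 / 0.031 / 0.063 radii correspond roughly to 4/255, 8/255, and 16/255 on a [0, 1] pixel scale), not part of any released code.

```python
# Hypothetical summary of the L-inf radii reported in the Experiment Setup row.
EPSILON_CONFIG = {
    "mnist_vs_notmnist": {
        "eps_train": (0.1, 0.2),                  # adversarial-training radii eps_tr
        "eps_attack": 0.1,                        # radius used to perturb Not MNIST
        "eps_robustness": (0.1, 0.2, 0.3),        # robustness-study radii
    },
    "cifar10_vs_svhn": {
        "eps_train": (0.016, 0.031),              # roughly 4/255 and 8/255
        "eps_attack": 0.016,                      # radius used to perturb SVHN
        "eps_robustness": (0.016, 0.031, 0.063),  # up to roughly 16/255
    },
}
```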