The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks

Authors: Ziquan Liu, Yufei Cui, Yan Yan, Yi Xu, Xiangyang Ji, Xue Liu, Antoni B. Chan

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical study on four image classification datasets across three popular AT baselines validates the effectiveness of the proposed Uncertainty-Reducing AT (AT-UR).
Researcher Affiliation | Academia | Queen Mary University of London; McGill University, Mila; Washington State University; Dalian University of Technology; Tsinghua University; City University of Hong Kong.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/ziquanliu/ICML2024-AT-UR.
Open Datasets | Yes | Four datasets are used to evaluate our method, i.e., CIFAR10, CIFAR100 (Krizhevsky et al., 2009), Caltech256 (Griffin et al., 2007) and Caltech-UCSD Birds-200-2011 (CUB200) (Wah et al., 2011).
Dataset Splits | Yes | CIFAR10 and CIFAR100 contain low-resolution images of 10 and 100 classes, where the training and validation sets have 50,000 and 10,000 images respectively. Caltech-256 has 30,607 high-resolution images and 257 classes, which are split into a training and a validation set using a 9:1 ratio. CUB200 also contains high-resolution bird images for fine-grained image classification, with 200 classes, 5,994 training images and 5,794 validation images. We fix the training set in our experiment and randomly split the original test set into calibration and test sets with a ratio of 1:4 for conformal prediction. (A minimal split-conformal sketch follows the table.)
Hardware Specification | No | The paper describes experimental settings related to models, datasets, training, and attacks, but does not specify the hardware used (e.g., GPU models, CPU types, memory).
Software Dependencies | No | The paper mentions 'numpy (Harris et al., 2020)' but does not specify version numbers for numpy or any other key software dependencies (e.g., deep learning frameworks like PyTorch or TensorFlow, CUDA versions).
Experiment Setup | Yes | The PGD attack has 10 steps, with step size λ = 2/255 and attack budget ϵ = 8/255. The batch size is set to 128 and training runs for 60 epochs. We divide the learning rate by 0.1 at the 30th and 50th epochs. We set λ_EM = 0.3 in all of our experiments... We use a = 1.1 and search b over the discrete set {2.0, 3.0, 4.0, 5.0}. The learning rate and weight decay of AT, FAT and TRADES are determined by grid search over {1e-4, 3e-4, 1e-3, 3e-3, 1e-2} and {1e-3, 1e-4, 1e-5} respectively. For TRADES, we follow the default setting β = 6.0 for the KL divergence term. (A generic PGD sketch follows the table.)
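For reference, the PGD configuration quoted under "Experiment Setup" corresponds to a standard l_inf projected-gradient attack. The sketch below is a minimal, generic PyTorch implementation under those quoted hyper-parameters (10 steps, step size 2/255, budget 8/255); it is not taken from the authors' repository, and the function name pgd_attack is illustrative.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, step_size=2/255, steps=10):
    """Generic l_inf PGD with the hyper-parameters quoted above (illustrative, not the authors' code)."""
    # Random start inside the eps-ball, clipped to the valid image range [0, 1].
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Gradient-sign ascent step, then projection back into the eps-ball around x.
        x_adv = x_adv.detach() + step_size * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()
```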
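Similarly, the 1:4 calibration/test split noted under "Dataset Splits" feeds a split-conformal procedure. Below is a minimal numpy sketch of that split and of vanilla split-conformal prediction sets using the 1 − p_y nonconformity score; the score choice and function names are assumptions for illustration, not necessarily the paper's exact conformal method.

```python
import numpy as np

def split_cal_test(n, cal_ratio=0.2, seed=0):
    """Randomly split n held-out indices into calibration and test parts (1:4 by default)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_cal = int(n * cal_ratio)
    return perm[:n_cal], perm[n_cal:]

def conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Vanilla split-conformal prediction sets from softmax probabilities.

    cal_probs:  (n_cal, K) softmax outputs on the calibration split
    cal_labels: (n_cal,)   true labels of the calibration split
    test_probs: (n_test, K) softmax outputs on the test split
    Returns a boolean (n_test, K) mask; True marks classes kept in the prediction set.
    """
    n_cal = len(cal_labels)
    # Nonconformity score: one minus the probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n_cal), cal_labels]
    # Finite-sample corrected (1 - alpha) quantile of the calibration scores.
    q_level = min(np.ceil((n_cal + 1) * (1 - alpha)) / n_cal, 1.0)
    qhat = np.quantile(scores, q_level, method="higher")
    return (1.0 - test_probs) <= qhat
```

With alpha = 0.1, such sets cover the true label with probability at least 90% when calibration and test data are exchangeable; it is this clean-data guarantee that adversarial perturbations can undermine, which is the setting the paper studies.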