Data Poisoning Attacks against Conformal Prediction

Authors: Yangyi Li, Aobo Chen, Wei Qian, Chenxu Zhao, Divya Lidder, Mengdi Huai

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To bridge this gap, for the first time, we in this paper propose a new class of black-box data poisoning attacks against CP, where the adversary aims to cause the desired manipulations of some specific examples' prediction uncertainty results (instead of misclassifications). Additionally, we design novel optimization frameworks for our proposed attacks. Further, we conduct extensive experiments to validate the effectiveness of our attacks on various settings (e.g., the full and split CP settings). Notably, our extensive experiments show that our attacks are more effective in manipulating uncertainty results than traditional poisoning attacks that aim at inducing misclassifications, and existing defenses against conventional attacks are ineffective against our proposed attacks.
Researcher Affiliation | Academia | Department of Computer Science, Iowa State University, United States. Correspondence to: Mengdi Huai <mdhuai@iastate.edu>.
Pseudocode | No | The paper describes mathematical formulations and optimization procedures in text, but does not provide structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the methodology described.
Open Datasets | Yes | In experiments, we adopt the following image classification datasets: Tiny-ImageNet (Deng et al., 2009) and CIFAR-10/100 (Krizhevsky et al.).
Dataset Splits | Yes | In experiments, we allocate 10% data for calibration and maintain a default coverage rate (1 − ε) of 0.9.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions "utilizing the SGD optimizer" but does not specify software dependencies with version numbers (e.g., specific Python libraries like PyTorch or TensorFlow versions).
Experiment Setup | Yes | In experiments, we allocate 10% data for calibration and maintain a default coverage rate (1 − ε) of 0.9. We limit the perturbation bound ϵ to 16/255. The poisoning attacks are implemented through training the models from scratch (Huang et al., 2020; Huai et al., 2020b), utilizing the SGD optimizer with a learning rate of 0.01 and a batch size of 128. We evaluate the attack results in each experiment by randomly sampling a target class. We generate poisons and evaluate them on 8 newly initialized victim models. We repeat each experiment 10 times and report the mean and standard errors.
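For context on the calibration settings quoted in the table (a 10% calibration split and a target coverage rate of 1 − ε = 0.9), the sketch below shows a standard split conformal prediction calibration step. It is a minimal illustration, not the authors' code: the nonconformity score (1 minus the softmax probability of the true class), the placeholder data, and all function names are assumptions chosen for this example.

```python
# Minimal sketch of split conformal prediction calibration under the reported
# settings: 10% of the data reserved for calibration, target coverage 1 - eps = 0.9.
# The score function, placeholder data, and names are illustrative assumptions,
# not the authors' implementation.
import numpy as np

def calibrate_threshold(cal_probs, cal_labels, eps=0.1):
    """Conformal threshold computed from calibration-set nonconformity scores."""
    n = len(cal_labels)
    # Nonconformity score: 1 - softmax probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected quantile level for (1 - eps) marginal coverage.
    q_level = min(np.ceil((n + 1) * (1 - eps)) / n, 1.0)
    return np.quantile(scores, q_level, method="higher")

def prediction_set(test_probs, threshold):
    """Classes whose nonconformity score falls below the calibrated threshold."""
    return np.where(1.0 - test_probs <= threshold)[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder calibration split: 1,000 examples (10% of a 10,000-example pool).
    cal_probs = rng.dirichlet(np.ones(10), size=1000)
    cal_labels = rng.integers(0, 10, size=1000)
    tau = calibrate_threshold(cal_probs, cal_labels, eps=0.1)
    print("threshold:", tau)
    print("prediction set:", prediction_set(rng.dirichlet(np.ones(10)), tau))
```

In this setting, a poisoning attack of the kind the paper describes would perturb the training data so that, after the victim model is retrained and recalibrated, the prediction sets of targeted test examples change as the adversary desires, manipulating the uncertainty output rather than the top-1 label.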