Robustness of AI-Image Detectors: Fundamental Limits and Practical Attacks

Authors: Mehrdad Saberi, Vinu Sankar Sadasivan, Keivan Rezaei, Aounon Kumar, Atoosa Chegini, Wenxiao Wang, Soheil Feizi

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To validate our theoretical findings, we also provide empirical evidence demonstrating that diffusion purification effectively removes low perturbation budget watermarks by applying minimal changes to images. Finally, we extend our theory to characterize a fundamental trade-off between the robustness and reliability of classifier-based deep fake detectors and demonstrate it through experiments." (A hedged diffusion-purification sketch follows the table.)
Researcher Affiliation | Academia | "Mehrdad Saberi, Vinu Sankar Sadasivan, Keivan Rezaei, Aounon Kumar, Atoosa Chegini, Wenxiao Wang, Soheil Feizi; Department of Computer Science, University of Maryland; {msaberi,vinu,krezaei,aounon,atoocheg,wwx,sfeizi}@umd.edu"
Pseudocode | Yes | "We provide the pseudocode for spoofing watermarks in Algorithm 1."
Open Source Code | Yes | "Code is available at https://github.com/mehrdadsaberi/watermark_robustness."
Open Datasets | Yes | "Our evaluation is conducted on a set of 100 images drawn from the ImageNet dataset (Russakovsky et al., 2015), and their watermarked counterparts using each method. We perform experiments on the images from the FaceForensics++ dataset hosted by Rössler et al. (2019) to verify our theoretical insights empirically." (A hedged data-preparation sketch follows the table.)
Dataset Splits | Yes | "Our substitute classifiers are trained for 10 epochs and receive higher than 99.8% accuracy on validation data. After preprocessing, our FaceSwap image dataset contains 4316 (1059, respectively) original and 3529 (1857, respectively) manipulated images in the training (test, respectively) dataset. Similarly, our DeepFakes image dataset contains 4316 (1059, respectively) original and 3522 (1843, respectively) manipulated images in the training (test, respectively) dataset."
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory, or processor types with speeds) used for running its experiments. While the experiments imply the use of computational resources, no concrete specifications are listed.
Software Dependencies | No | The paper mentions the use of models (e.g., ResNet-18, VGG-16-BN, ImageNet-pretrained models) and general frameworks (e.g., diffusion models), but does not list specific version numbers for any software dependencies, libraries, or frameworks required to reproduce the experiments (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "Our substitute classifiers are trained for 10 epochs and receive higher than 99.8% accuracy on validation data. To launch adversarial attacks on images using substitute classifiers, we employ a PGD attack with 300 iterations and a step size denoted as α = 0.05ϵ. We train different detectors with the standard deviation of noise σ varied from 0 to 20." (Hedged PGD-attack and noise-training sketches follow the table.)
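
The sketches below illustrate the experimental components quoted in the table; none of them reproduces the paper's own code.

Diffusion purification (referenced in the Research Type row) removes a watermark by adding Gaussian noise to the image up to some diffusion timestep and then denoising it with a pretrained diffusion model. The following is a minimal sketch of that idea, assuming the Hugging Face diffusers DDPM API; the checkpoint and the timestep t_star are illustrative choices, not the paper's exact setup.

```python
import torch
from diffusers import DDPMPipeline

# An off-the-shelf unconditional DDPM (checkpoint chosen for illustration only).
pipe = DDPMPipeline.from_pretrained("google/ddpm-cifar10-32")
unet, scheduler = pipe.unet, pipe.scheduler

@torch.no_grad()
def diffusion_purify(image: torch.Tensor, t_star: int = 100) -> torch.Tensor:
    """Purify a batch of images in [-1, 1] with shape [B, C, H, W].

    Forward-noise the input up to timestep t_star, then run the reverse
    diffusion process back to t=0 with the pretrained denoiser.
    """
    noise = torch.randn_like(image)
    timesteps = torch.full((image.shape[0],), t_star, dtype=torch.long)
    # Forward process: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise
    x_t = scheduler.add_noise(image, noise, timesteps)
    # Reverse process from t_star down to 1.
    for t in range(t_star, 0, -1):
        eps = unet(x_t, t).sample
        x_t = scheduler.step(eps, t, x_t).prev_sample
    return x_t  # purified image
```

Larger values of t_star erase larger watermark perturbations but also alter the image more, which is consistent with the quoted claim that low-perturbation-budget watermarks can be removed with minimal changes to the image.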
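The Open Datasets row describes the watermarking evaluation set: 100 images drawn from ImageNet plus their watermarked counterparts under each method. Below is a hedged preparation sketch; apply_watermark is a hypothetical stand-in for any one watermarking method, and the directory layout, file names, and seed are assumptions, not details from the paper.

```python
import random
from pathlib import Path
from PIL import Image

def build_eval_set(imagenet_dir: str, out_dir: str, apply_watermark,
                   n: int = 100, seed: int = 0) -> None:
    """Sample n ImageNet images and save (original, watermarked) pairs.

    apply_watermark is a hypothetical callable PIL.Image -> PIL.Image,
    standing in for whichever watermarking method is being evaluated.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = sorted(Path(imagenet_dir).rglob("*.JPEG"))
    random.seed(seed)
    for i, path in enumerate(random.sample(paths, n)):
        img = Image.open(path).convert("RGB")
        img.save(out / f"{i:03d}_orig.png")
        apply_watermark(img).save(out / f"{i:03d}_wm.png")
```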
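The Experiment Setup row quotes a PGD attack with 300 iterations and step size α = 0.05ϵ against substitute classifiers. Below is a generic ℓ∞ projected-gradient-descent sketch consistent with those hyperparameters, assuming a PyTorch classifier and inputs in [0, 1]; the default ϵ value is an assumption, and this is a standard formulation rather than a copy of the paper's implementation.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, n_iter=300, alpha_ratio=0.05):
    """L-infinity PGD on a substitute classifier.

    x: images in [0, 1]; y: labels (e.g., watermarked vs. non-watermarked).
    Step size alpha = 0.05 * eps and 300 iterations match the quoted setup.
    """
    alpha = alpha_ratio * eps
    # Random start inside the eps-ball, clipped to the valid pixel range.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(n_iter):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)      # untargeted: push away from true label
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()            # gradient ascent step
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)   # project onto the eps-ball
        x_adv = x_adv.clamp(0, 1)                               # keep valid pixel range
    return x_adv.detach()
```

Per the quoted setup, the adversarial images are crafted on the substitute classifier and then used to attack the target detector.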
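The same row mentions training detectors with the noise standard deviation σ varied from 0 to 20. The sketch below shows one way such noise-augmented training could look; interpreting σ on the 0-255 pixel scale is my assumption, and the model, optimizer, learning rate, and data loader are placeholders rather than the paper's choices.

```python
import torch
import torch.nn.functional as F

def train_noisy_detector(model, loader, sigma, epochs=10, lr=1e-3, device="cuda"):
    """Train a binary real/fake detector with Gaussian noise augmentation.

    sigma is the noise standard deviation on the 0-255 pixel scale (assumption);
    inputs from `loader` are assumed to lie in [0, 1], so the noise is rescaled.
    """
    model = model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            x_noisy = (x + torch.randn_like(x) * sigma / 255.0).clamp(0, 1)
            loss = F.cross_entropy(model(x_noisy), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

# One detector per noise level, e.g. sigma in {0, 5, 10, 15, 20} (illustrative grid).
```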