Raising the Cost of Malicious AI-Powered Image Editing

Authors: Hadi Salman, Alaa Khaddaj, Guillaume Leclerc, Andrew Ilyas, Aleksander Madry

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We examine the effectiveness of our proposed immunization method. Setup. We focus on the Stable Diffusion Model (SDM) v1.5 (Rombach et al., 2022), though our methods can be applied to other diffusion models too. In each of the following experiments, we aim to disrupt the performance of SDM by adding imperceptible noise (using either of our proposed attacks), i.e., applying our immunization procedure to a variety of images. The goal is to force the model to generate images that are unrealistic and unrelated to the original (immunized) image. We evaluate the performance of our method both qualitatively (by visually inspecting the generated images) and quantitatively (by examining the image quality using standard metrics). We defer further experimental details to Appendix A." "Table 1: We report various image quality metrics measuring the similarity between edits originating from immunized vs. non-immunized images."
Researcher Affiliation | Academia | "Hadi Salman*, Alaa Khaddaj*, Guillaume Leclerc*, Andrew Ilyas, Aleksander Madry (*equal contribution; all authors at MIT). Correspondence to: Hadi Salman <hady@mit.edu>."
Pseudocode | Yes | "Algorithm 1: Encoder Attack on a Stable Diffusion Model. Algorithm 2: Diffusion Attack on a Stable Diffusion Model." (A hedged sketch of a comparable encoder attack appears after this table.)
Open Source Code | Yes | "Code is available at https://github.com/MadryLab/photoguard."
Open Datasets | Yes | "We focus on the Stable Diffusion Model (SDM) v1.5 (Rombach et al., 2022)... This model is available on: https://huggingface.co/runwayml/stable-diffusion-v1-5." (A loading sketch follows this table.)
Dataset Splits | No | The paper describes experiments performed on a pre-trained Stable Diffusion Model. It does not specify any train/validation/test splits for its own experimental setup or data collection.
Hardware Specification | Yes | "We used an A100 with 40 GB memory." "Work partially done on the MIT Supercloud compute cluster (Reuther et al., 2018)."
Software Dependencies | No | The paper mentions the Stable Diffusion Model (SDM) v1.5 and a pretrained CLIP model, but it does not list its key software dependencies with specific version numbers (e.g., Python, PyTorch, or other library versions).
Experiment Setup | Yes | "Table 2: Hyperparameters used for the Stable Diffusion model. Table 3: Hyperparameters used for the adversarial attacks."
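
For context on the checkpoint cited in the Open Datasets row, below is a minimal sketch of loading SDM v1.5 through the Hugging Face diffusers library and running a basic image-to-image edit, the setting the paper's immunization is meant to disrupt. The prompt, input file name, strength, and guidance_scale values are illustrative assumptions, not the settings from the paper's Table 2.

```python
# Minimal sketch: load the SDM v1.5 checkpoint named in the paper and run a
# simple img2img edit. Prompt and sampler settings below are assumptions.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")  # the paper reports using an A100 with 40 GB of memory

init_image = Image.open("photo.png").convert("RGB").resize((512, 512))
edited = pipe(
    prompt="a photo of a person on a beach",  # hypothetical edit prompt
    image=init_image,
    strength=0.7,        # assumed value; the paper's settings are in Table 2
    guidance_scale=7.5,  # assumed value
).images[0]
edited.save("edited.png")
```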
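Algorithm 1 itself appears in the paper; the following is only a minimal sketch of an L-infinity PGD-style encoder attack in the same spirit, perturbing an image so that the SDM's VAE encoder maps it near the latent of a plain gray image. The epsilon, step size, iteration count, and gray-image target are illustrative assumptions rather than the values in the paper's Table 3.

```python
# Hedged sketch of a PGD-style encoder attack: push the VAE latent of the
# perturbed image toward the latent of a gray target, within an L_inf budget.
# Hyperparameter defaults are assumptions, not the paper's Table 3 values.
import torch

def encoder_attack(x, vae, eps=16 / 255, step=2 / 255, iters=200):
    """x: image tensor in [0, 1], shape (1, 3, 512, 512);
    vae: the pipeline's AutoencoderKL (run in float32 for stable gradients)."""
    with torch.no_grad():
        gray = torch.full_like(x, 0.5)  # assumed target: a flat gray image
        z_target = vae.encode(2 * gray - 1).latent_dist.mean
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(iters):
        z = vae.encode(2 * (x + delta) - 1).latent_dist.mean  # VAE expects [-1, 1]
        loss = (z - z_target).norm()  # distance to the target latent
        loss.backward()
        with torch.no_grad():
            delta -= step * delta.grad.sign()         # descend toward the target
            delta.clamp_(-eps, eps)                   # L_inf budget (imperceptibility)
            delta.copy_((x + delta).clamp(0, 1) - x)  # keep the image valid
        delta.grad = None
    return (x + delta).detach()  # the "immunized" image
```

Edits applied to the returned image should then land far from the original in latent space, which is the disruption the quantitative metrics in Table 1 measure.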