Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Enhancing Diffusion-based Unrestricted Adversarial Attacks via Adversary Preferences Alignment

Authors: Kaixun Jiang, Zhaoyu Chen, Haijing Guo, Jinglun Li, Jiyuan Fu, Pinxue Guo, Hao Tang, Bo Li, Wenqiang Zhang

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that APA achieves significantly better attack transferability while maintaining high visual consistency, inspiring further research to approach adversarial attacks from an alignment perspective. 5 Experiments 5.1 Experimental Settings 5.2 Attack Performance Comparison 5.3 Attacks on Adversarial Defense 5.4 Visual Quality Comparison 5.5 Ablation Studies Table 1: Attack performance comparison on normally trained CNNs and ViTs.
Researcher Affiliation | Collaboration | Kaixun Jiang1, Zhaoyu Chen1, Haijing Guo2, Jinglun Li1, Jiyuan Fu2, Pinxue Guo1, Hao Tang3, Bo Li4, Wenqiang Zhang1,2. 1 College of Intelligent Robotics and Advanced Manufacturing, Fudan University; 2 Shanghai Key Lab of Intelligent Information Processing, College of Computer Science and Artificial Intelligence, Fudan University; 3 School of Computer Science, Peking University; 4 vivo Mobile Communication Co., Ltd.
Pseudocode | Yes | Figure 2 and Algorithm 1 present the overall framework of APA. C Pseudo Code of APA We provide pseudo code of our APA framework in Algorithm 1 and Symbol Table in Table 5. Algorithm 1: Our APA Framework
Open Source Code | Yes | Code is available at https://github.com/deep-kaixun/APA.
Open Datasets | Yes | Datasets and Models. We choose the widely used ImageNet-compatible Dataset [34], consisting of 1,000 images from ImageNet's validation set [14]. 5. Open access to data and code Answer: [Yes] Justification: We provide the code in the supplementary material. The datasets used (e.g., ImageNet-compatible dataset) are publicly available, as referenced in the paper.
Dataset Splits | No | The paper mentions using "1,000 images from ImageNet's validation set" for evaluation. However, it does not specify explicit training, validation, or test splits for its own model's development or fine-tuning. The LoRA fine-tuning is done "with each clean image," which means on individual images rather than a dataset split.
Hardware Specification | Yes | Our empirical evaluation on an NVIDIA A100 GPU shows that visual consistency alignment requires 38 seconds, while attack effectiveness alignment (APA-SG) takes 58.5 seconds.
Software Dependencies | No | The paper mentions: "Our work is based on Stable Diffusion V1.5 [60]." However, it does not provide specific version numbers for other key software components like Python, PyTorch, or CUDA.
Experiment Setup | Yes | Implementation Details. We set attack guidance step $T_a = 10$, attack iterations $N = 10$, attack scale $\epsilon_a = 0.4$, and attack step size $\mu = 0.04$. APA-SG adopts the entire inversion step of $T = 50$. APA-GC adopts $T = 10$ to improve efficiency. Our work is based on Stable Diffusion V1.5 [60]. During the visual consistency alignment phase, we fine-tune only the projection matrices Q, K, and V in the attention modules of the UNet with each clean image. With the LoRA rank set to 8, we train for 200 steps. Experimental results indicate that APA-GC delivers strong attack performance. Thus, we add visual consistency constraints to APA-GC's $R_a$ without concerns about impacting attack performance, setting $R_a = R_a - \lambda \lVert z_0 - \bar{z}_0 \rVert_2$ with $\lambda = 10$.
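
For readers attempting a reproduction, the hyperparameters quoted in the Experiment Setup row can be gathered in one place. The following is a minimal Python sketch, not the authors' code: the class name, field names, and the `penalized_reward` helper are hypothetical and do not come from the APA repository. The penalty is assumed to be the attack reward minus λ times the L2 distance between the adversarial and clean latents, which is one plausible reading of the R_a update quoted above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class APASetup:
    """Values quoted in the Experiment Setup row; all field names are hypothetical."""
    attack_guidance_steps: int = 10   # T_a
    attack_iterations: int = 10       # N
    attack_scale: float = 0.4         # epsilon_a
    attack_step_size: float = 0.04    # mu
    inversion_steps_sg: int = 50      # T for APA-SG
    inversion_steps_gc: int = 10      # T for APA-GC
    lora_rank: int = 8                # LoRA rank for the Q, K, V projections
    lora_train_steps: int = 200       # fine-tuning steps per clean image
    consistency_weight: float = 10.0  # lambda


def penalized_reward(reward, z_adv, z_clean, weight):
    """Assumed form: reward minus weight times the L2 distance between two latents.

    z_adv and z_clean are flat sequences of floats standing in for latent tensors.
    """
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(z_adv, z_clean)))
    return reward - weight * distance


cfg = APASetup()
# Identical latents incur no penalty, so the reward is returned unchanged.
unchanged = penalized_reward(1.0, [0.0, 0.0], [0.0, 0.0], cfg.consistency_weight)
```

Because the report notes that versions of Python, PyTorch, and CUDA are not stated in the paper, a configuration object like this only pins the algorithmic settings; the software environment still has to be taken from the linked repository.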