Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Exploring Semantic-constrained Adversarial Example with Instruction Uncertainty Reduction

Authors: Jin Hu, Jiakai Wang, Linna Jing, Haolin Li, Liu Haodong, Haotong Qin, Aishan Liu, Ke Xu, Xianglong Liu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the superiority of the transfer attack performance of InSUR. Besides, it is worth highlighting that we realize the reference-free generation of semantically constrained 3D adversarial examples by utilizing language-guided 3D generation models for the first time.
Researcher Affiliation | Academia | 1 State Key Laboratory of Complex & Critical Software Environment (CCSE), Beihang University; 2 Zhongguancun Laboratory; 3 School of Computer Science and Engineering, Beihang University; 4 Department of Information Technology and Electrical Engineering, ETH Zurich
Pseudocode | Yes | Algorithm 1: ResAdv-DDIM
Open Source Code | No | For responsibility, we will release the code of InSUR framework after the paper is published for reference.
Open Datasets | Yes | We evaluate different generation methods by generating 6 samples for each label in the ImageNet 1000-class label evasion task and the proposed abstracted label evasion task. ... based on the hyponymic relation defined by WordNet [44] taxonomy.
Dataset Splits | No | We evaluate different generation methods by generating 6 samples for each label in the ImageNet 1000-class label evasion task and the proposed abstracted label evasion task.
Hardware Specification | Yes | 2D / 3D generation times are benchmarked on a single 4090 or A800 GPU, respectively, by generating 100 samples with abstracted labels, and are presented in the results with standard deviation.
Software Dependencies | No | The paper uses various models and frameworks (e.g., CLIP [54] model RN50, stabilityai/stable-diffusion-2-1-base [55], LLaVA [56] and Qwen-7B [57], Trellis [43]), but does not specify version numbers for general software dependencies like PyTorch, TensorFlow, or CUDA.
Experiment Setup | Yes | In Table 6, we present a parameter analysis of ξ1 and ξ2 using ResNet50 as the surrogate model on the Abstract Label Evasion Task. ... The default configuration for other parameters is coherent with related baselines, i.e., β = 0.5, s = 0.7, T = 100.
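
The quoted setup and hardware rows can be made concrete with a small sketch. It is not the authors' code, which was unreleased at the time of assessment. It only records the default values quoted above (0.5, 0.7 and 100, plus 6 samples per label for the 1000-class task) in a hypothetical `DefaultSetup` container, and shows one way a "100 samples, mean with standard deviation" timing benchmark might be computed. All class, field and function names are assumptions, and `generate` stands in for whatever generation call is being timed.

```python
import statistics
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class DefaultSetup:
    """Hypothetical container; only the values come from the quoted rows."""
    beta: float = 0.5           # quoted default
    s: float = 0.7              # quoted default
    T: int = 100                # quoted default
    samples_per_label: int = 6  # "generating 6 samples for each label"
    num_labels: int = 1000      # 1000-class label evasion task only


def total_samples(cfg: DefaultSetup) -> int:
    """Number of generated samples implied for the 1000-class task."""
    return cfg.samples_per_label * cfg.num_labels


def benchmark(generate, n_samples: int = 100):
    """Time n_samples calls of generate(); return (mean, std) in seconds."""
    durations = []
    for _ in range(n_samples):
        start = time.perf_counter()
        generate()
        durations.append(time.perf_counter() - start)
    # stdev is the sample standard deviation and needs at least two timings.
    return statistics.mean(durations), statistics.stdev(durations)
```

Under these assumptions, `total_samples(DefaultSetup())` covers only the 1000-class task; the number of labels in the abstracted label evasion task is not stated in the quoted text, so it is left out.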