Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Fit the Distribution: Cross-Image/Prompt Adversarial Attacks on Multimodal Large Language Models

Authors: Hai Yan, Haijian Ma, Xiaowen Cai, Daizong Liu, Zenghui Yuan, Xiaoye Qu, Jianfeng Dong, Runwei Guan, Xiang Fang, Hongyang He, Yulai Xie, Pan Zhou

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments are conducted to verify the strong adversarial capabilities of our proposed attack against prevalent MLLMs spanning a spectrum of images/prompts.
Researcher Affiliation	Academia	1 Huazhong University of Science and Technology; 2 Wuhan University; 3 Zhejiang Gongshang University; 4 The Hong Kong University of Science and Technology; 5 Nanyang Technological University; 6 University of Warwick
Pseudocode	Yes	Algorithm 1 Cross-Image/Prompt Adversarial Attack against MLLMs
Open Source Code	No	Answer: [No] Justification: We will release the codes upon acceptance.
Open Datasets	Yes	The image data are sourced from the MS-COCO dataset [81] and the DALLE-3 dataset [2]. In addition to the MS-COCO dataset [81] and DALLE-3 dataset [2], we also conduct experiments on the SVIT [90], Flickr30K [91] and NoCaps [92] datasets.
Dataset Splits	Yes	To assess adversarial generalization, three cross-sample settings are considered: (1) 30 distinct prompts sampled per image (Cross-Prompt), (2) 50 different images sampled per prompt (Cross-Image), and (3) simultaneous sampling of 30 images and 30 prompts (Cross-Image/Prompt).
Hardware Specification	Yes	All experiments are conducted on the NVIDIA A800 GPUs with 80GB of memory.
Software Dependencies	No	The MLLMs used are LLaVA-1.5-7B-hf [6], BLIP-2 OPT-2.7B [5], and MiniGPT-4 [82], chosen to represent a diverse range of architectures and model scales among current MLLMs. ...obtained from the all-MiniLM-L6-v2 model [83].
Experiment Setup	Yes	For the parameter calculation of image/prompt distribution, the coefficients β and r are set to 0.9 and 0.001, respectively. The target answer is uniformly specified as "I am sorry." in all experimental conditions. ... The perturbations are optimized for 300 steps under η = 16/255 and step size α = 1/255.
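
The Experiment Setup values above (300 steps, η = 16/255, α = 1/255) can be read as the settings of an iterative, bounded perturbation update. The sketch below is a generic projected sign-gradient loop, not the paper's Algorithm 1: it assumes η is an L-infinity budget on the perturbation, and `optimize_perturbation`, `toy_grad`, and `toy_loss` are hypothetical names introduced only for illustration. The toy loss stands in for the real attack objective, which is not described on this page.

```python
import random

ETA = 16 / 255    # perturbation budget (assumed here to be an L-infinity bound)
ALPHA = 1 / 255   # step size per iteration
STEPS = 300       # number of optimization steps


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def clip(v, lo, hi):
    """Clamp v to the closed interval [lo, hi]."""
    return max(lo, min(hi, v))


def optimize_perturbation(pixels, grad_fn, eta=ETA, alpha=ALPHA, steps=STEPS):
    """Generic projected sign-gradient descent on a flat list of pixel values.

    grad_fn is a hypothetical callable returning d(loss)/d(pixel) for the
    current adversarial pixels; it is a stand-in for the real attack loss.
    """
    delta = [0.0] * len(pixels)
    for _ in range(steps):
        adv = [clip(p + d, 0.0, 1.0) for p, d in zip(pixels, delta)]
        grad = grad_fn(adv)
        for i, g in enumerate(grad):
            # Step against the gradient, then project back into the budget.
            d = clip(delta[i] - alpha * sign(g), -eta, eta)
            # Keep the perturbed pixel inside the valid range [0, 1].
            delta[i] = clip(pixels[i] + d, 0.0, 1.0) - pixels[i]
    return delta


def toy_loss(values):
    """Toy objective: squared distance of every pixel from 1.0."""
    return sum((v - 1.0) ** 2 for v in values)


def toy_grad(values):
    """Gradient of toy_loss with respect to each pixel."""
    return [2.0 * (v - 1.0) for v in values]


random.seed(0)
pixels = [random.random() for _ in range(16)]
delta = optimize_perturbation(pixels, toy_grad)
adv = [p + d for p, d in zip(pixels, delta)]
```

With α = 1/255 and η = 16/255, a perturbation that always moves in the same direction reaches the budget after 16 steps, so most of the 300 steps are spent moving within the allowed region, not expanding it.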