Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations

Authors: Buyun Liang, Liangzu Peng, Jinqi Luo, Darshan Thaker, Kwan Ho Ryan Chan, René Vidal

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate through experiments on open-ended multiple-choice question answering tasks that SECA achieves higher attack success rates while incurring almost no semantic equivalence or semantic coherence errors compared to existing methods. SECA highlights the sensitivity of both open-source and commercial gradient-inaccessible LLMs to realistic and plausible prompt variations.
Researcher Affiliation | Academia | Buyun Liang, Liangzu Peng, Jinqi Luo, Darshan Thaker, Kwan Ho Ryan Chan, René Vidal; University of Pennsylvania
Pseudocode | Yes | Algorithm 1 Semantically Equivalent and Coherent Attacks (SECA)
Open Source Code | Yes | Code is available at https://github.com/Buyun-Liang/SECA.
Open Datasets | Yes | We evaluate our approach with the commonly used MMLU dataset [19]
Dataset Splits | No | We evaluate our approach with the commonly used MMLU dataset [19], where each sample consists of multiple-choice questions and the correct answer. However, some questions in this dataset might already induce hallucinations of a target LLM. To isolate the effect of this, we only consider the questions for which the target LLMs produce correct answers. To do so, we create a filtered subset of MMLU, where each prompt is included if and only if all target LLMs assign the highest confidence to the correct answer token. After this filtering, the resulting dataset contains 347 samples and spans 16 diverse subjects such as science, engineering, and arts.
Hardware Specification | Yes | All experiments were conducted using four NVIDIA A5000 GPUs, each with 24.5 GB of memory.
Software Dependencies | No | The paper mentions several LLMs like GPT-2 and PyTorch-related libraries in references or for specific baseline (GCG), but does not provide specific version numbers for the software dependencies used in their own methodology (SECA).
Experiment Setup | Yes | For SECA, we set the hyperparameters as follows: M = 3, N = 3, max_iteration = 30, and termination_threshold = 1.0. The key hyperparameters for all target LLMs are: top_p = 1.0 and temperature = 1.0. To ensure reproducibility, we set seed = 42. Throughout all experiments, we fix the tolerance at γ = 60 to permit a small degree of incoherence, reflecting what is commonly observed in human-generated prompts. We initialize the candidate set with N copies of the original prompt x_0 (Line 2).
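
The filtering rule quoted in the Dataset Splits row (keep a prompt if and only if all target LLMs assign the highest confidence to the correct answer token) can be illustrated with a minimal Python sketch. This is not the authors' code, which is linked in the Open Source Code row. The function names `ranks_correct_first` and `filter_samples`, the dictionary layout, and the strict treatment of ties are all assumptions made for illustration.

```python
from typing import Dict, List, Mapping

# Hypothetical layout (an assumption, not taken from the paper):
#   sample      = {"id": "q1", "answer": "A"}
#   confidences = {model_name: {sample_id: {choice: confidence}}}
Confidences = Mapping[str, Mapping[str, Mapping[str, float]]]


def ranks_correct_first(choice_conf: Mapping[str, float], correct: str) -> bool:
    """Return True if the correct choice has strictly the highest confidence.

    Treating ties as failures is an assumption; the quoted text does not
    say how ties are handled.
    """
    best_other = max(
        (conf for choice, conf in choice_conf.items() if choice != correct),
        default=float("-inf"),
    )
    return choice_conf[correct] > best_other


def filter_samples(samples: List[Dict[str, str]], confidences: Confidences) -> List[Dict[str, str]]:
    """Keep only samples that every target model answers correctly."""
    kept = []
    for sample in samples:
        if all(
            ranks_correct_first(per_model[sample["id"]], sample["answer"])
            for per_model in confidences.values()
        ):
            kept.append(sample)
    return kept


if __name__ == "__main__":
    # Toy data: two questions, two hypothetical models.
    toy_samples = [{"id": "q1", "answer": "A"}, {"id": "q2", "answer": "B"}]
    toy_conf = {
        "model_1": {"q1": {"A": 0.7, "B": 0.3}, "q2": {"A": 0.2, "B": 0.8}},
        "model_2": {"q1": {"A": 0.6, "B": 0.4}, "q2": {"A": 0.9, "B": 0.1}},
    }
    # q2 is dropped because model_2 prefers the wrong choice.
    print([s["id"] for s in filter_samples(toy_samples, toy_conf)])
```

Requiring agreement across all target models, rather than filtering per model, yields a single shared evaluation set, which is consistent with the single figure of 347 samples reported in that row.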