Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Learning to Steer: Input-dependent Steering for Multimodal LLMs

Authors: Jayneel Parekh, Pegah KHAYATAN, Mustafa Shukor, Arnaud Dapogny, Alasdair Newson, Matthieu Cord

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we first discuss generic experimental setup considerations (Section 4.1) to ensure reproducibility of the results. Then we present results for application of L2S for safety enforcement in MLLMs (Section 4.2) as well as hallucination mitigation (Section 4.3). Quantitative results: We report the safety steering results in Table 1.
Researcher Affiliation	Collaboration	1ISIR, Sorbonne Université, Paris, France 2Valeo.ai, Paris, France
Pseudocode	No	No explicit pseudocode or algorithm block is present in the paper. The methodology is described in Section 3.
Open Source Code	Yes	Our code is publicly available (footnotes 1, 2). Footnote 1: GitHub page: https://github.com/jayneelparekh/learn-to-steer
Open Datasets	Yes	The MMSafety Bench [36] database provides multimodal queries (image and text) to assess the security of MLLMs. For hallucination mitigation, we benchmark on the POPE dataset [28]. We further evaluate L2S on 500 randomly sampled images from the COCO validation set [29]
Dataset Splits	Yes	We use a random split of 80% of data for training/learning the steering vectors and 20% for testing. L2S is trained and tested on balanced subsets containing 70%, 10% and 20% of data for training, validation and test, respectively.
Hardware Specification	Yes	All experiments are conducted on a single RTX5000 (24GB) GPU.
Software Dependencies	No	The paper mentions using specific models (LLaVA-v1.5-7B, Qwen2-VL-7B, Llama-Guard-3-8B, Gemini-2.0-Flash) but does not explicitly provide version numbers for underlying software libraries or programming languages like Python, PyTorch, or CUDA.
Experiment Setup	Yes	The auxiliary network g_Θ for L2S consists in a single 2-layers MLP with hidden size 100, and is trained for 100 epochs using the Adam optimizer with either a learning rate of 10^-4 or 5×10^-5 as well as a batch size of 64. We use a cosine learning rate scheduler with warmup, followed by an adaptive scheduler that reduces the learning rate when the validation performance plateaus.
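
Illustrative sketch (not the authors' code): the 70%/10%/20% split quoted under Dataset Splits and the cosine-with-warmup phase quoted under Experiment Setup could look roughly like the following in plain Python. The function names, the seed, and the warmup length are assumptions made for illustration only; the quoted text does not state a warmup length, and the subsequent plateau-based scheduler is not covered here.

```python
import math
import random


def split_indices(n, fractions=(0.7, 0.1, 0.2), seed=0):
    """Shuffle range(n) and cut it into train/validation/test index lists.

    The 70/10/20 fractions come from the quoted text; the seed is an assumption.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


def cosine_lr_with_warmup(step, total_steps, base_lr=1e-4, warmup_steps=10):
    """Linear warmup up to base_lr, then cosine decay towards zero.

    base_lr=1e-4 is one of the two quoted learning rates;
    warmup_steps=10 is a hypothetical value.
    """
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    train, val, test = split_indices(1000)
    print(len(train), len(val), len(test))
    print(cosine_lr_with_warmup(0, total_steps=100))
```

The split is done on indices rather than on the data itself so the same partition can be reused across models; balancing the subsets by class, as the quoted text mentions, would need an extra per-class step that is omitted here.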