Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Authors: Yue Liu, Shengfang Zhai, Mingzhe Du, Yulin Chen, Tri Cao, Hongcheng Gao, Cheng Wang, Xinfeng Li, Kun Wang, Junfeng Fang, Jiaheng Zhang, Bryan Hooi
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the superiority of our model. Remarkably, it surpasses the runner-up by 19.27% F1 score on average, as shown in Figure 1. |
| Researcher Affiliation | Academia | Yue Liu1,2,3, Shengfang Zhai3, Mingzhe Du4,3, Yulin Chen3, Tri Cao3, Hongcheng Gao3, Cheng Wang3, Xinfeng Li4, Kun Wang4, Junfeng Fang3, Jiaheng Zhang3, Bryan Hooi2,3. Affiliations: 1Integrative Sciences and Engineering Programme, NUS Graduate School; 2Institute of Data Science, NUS; 3Department of Computer Science, School of Computing, NUS; 4School of Computer Science and Engineering, NTU. |
| Pseudocode | No | The paper describes the methodology in prose and mathematical formulas in Section 2, 'GuardReasoner-VL', and Section 2.3, 'Online Reinforcement Learning'. It does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | We release data, code, and models (3B/7B) of GuardReasoner-VL (https://github.com/yueliu1999/GuardReasoner-VL). |
| Open Datasets | Yes | First, we construct GuardReasoner-VLTrain, a reasoning corpus with 123K samples and 631K reasoning steps, spanning text, image, and text-image inputs. |
| Dataset Splits | No | The paper states 'For this constructed image data, we use 80% for training and 20% for testing.' (Section 2.1) and 'We obtain training data for online RL, consisting of 12K samples.' (Appendix A.5.2). However, it does not explicitly provide the train/test/validation splits for the main GuardReasoner-VLTrain dataset (123K samples) used for the initial SFT phase. |
| Hardware Specification | Yes | Environment. All experimental results are obtained on two servers with 8 NVIDIA H100 (80 GB) GPUs, and one server with 4 NVIDIA H200 (141GB) GPUs. |
| Software Dependencies | No | For SFT, we use the LLaMA Factory [98] training platform. For online RL, we use the EasyR1 [99] training platform. The paper names the platforms but does not specify their version numbers or other key software dependencies with versions (e.g., Python, PyTorch, CUDA). |
| Experiment Setup | Yes | The cutoff length is set to 2048 tokens. The initial learning rate is set to 5e-05, and we use the cosine learning rate scheduler. We use the BFloat16 training, and we adopt the full-parameter fine-tuning. We adopt AdamW optimizer. The number of epochs is set to 3. The total batch size is set to 192 = 8 (accumulate step) × 6 (batch size) × 4 (device). The DeepSpeed stage is set to 3. ... The number of rollouts is set to 16 and temperature = 1.2. The batch size of rollouts is set to 512. The batch size for the actor model is 256. The initial learning rate for the actor model is set to 1e-6, and the weight decay is set to 1e-2. The clipping ratio is set to 0.2. The length constraint is set to 1 for GuardReasoner-VL, and 1/6 for GuardReasoner-VL-Eco. |
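
The batch-size arithmetic and learning-rate schedule quoted in the Experiment Setup row can be checked with a short sketch. This is an illustration only, not the authors' training code: the function names are hypothetical, and the schedule assumes plain cosine decay to zero with no warmup, neither of which is stated in the row.

```python
import math

# SFT values quoted in the Experiment Setup row.
ACCUMULATION_STEPS = 8      # gradient accumulation steps
PER_DEVICE_BATCH_SIZE = 6   # samples per device per step
NUM_DEVICES = 4             # devices used for SFT
INITIAL_LR = 5e-5           # initial learning rate


def effective_batch_size(accum_steps: int, per_device: int, devices: int) -> int:
    """Samples consumed per optimizer update."""
    return accum_steps * per_device * devices


def cosine_lr(step: int, total_steps: int,
              base_lr: float = INITIAL_LR, min_lr: float = 0.0) -> float:
    """Cosine decay from base_lr to min_lr.

    Assumption: no warmup and min_lr = 0; the row only says
    'cosine learning rate scheduler'.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = effective_batch_size(ACCUMULATION_STEPS, PER_DEVICE_BATCH_SIZE, NUM_DEVICES)
    print(f"effective batch size: {total}")
    # total_steps=1000 is an arbitrary illustrative horizon, not a reported value.
    for step in (0, 500, 1000):
        print(f"step {step}: lr = {cosine_lr(step, 1000):.2e}")
```

The product 8 × 6 × 4 reproduces the reported total batch size of 192. The actual schedule depends on the total number of optimizer steps and any warmup configured in the training platform, which the row does not report.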