Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

DualCnst: Enhancing Zero-Shot Out-of-Distribution Detection via Text-Image Consistency in Vision-Language Models

Authors: Fayi Le, Wenwu He, Chentao Cao, Dong Liang, Zhuo-Xu Cui

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments across diverse OOD benchmarks demonstrate that DualCnst achieves state-of-the-art performance while remaining scalable, data-agnostic, and fully compatible with prior text-only VLM-based methods.
Researcher Affiliation	Academia	1 School of Computer Science and Mathematics, Fujian University of Technology; 2 Fujian Provincial Key Laboratory of Big Data Mining and Applications; 3 Department of Computer Science, Hong Kong Baptist University; 4 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
Pseudocode	Yes	Algorithm 1 Zero-shot OOD detection with text-image dual consistency
Open Source Code	Yes	The code is publicly available at: https://github.com/TMLSIAT/DualCnst.
Open Datasets	Yes	For our experiments, we use ImageNet-1k [30] as the primary ID dataset. OOD datasets include iNaturalist [11], SUN [31], Places [32], and Textures [33], which cover a wide variety of scenes and semantic categories.
Dataset Splits	Yes	When ImageNet-10 is treated as the in-distribution (ID) dataset and ImageNet-20 as OOD... The subset splits and ID label configurations follow MCM [9]. For a fair comparison, we reproduce NegLabel and MCM under the same protocol.
Hardware Specification	Yes	This paper introduces a dual consistency (DualCnst) method, implemented using Python 3.8 and PyTorch 1.13 library [65], with all experiments conducted on a single NVIDIA RTX A6000 GPU.
Software Dependencies	Yes	This paper introduces a dual consistency (DualCnst) method, implemented using Python 3.8 and PyTorch 1.13 library [65]
Experiment Setup	Yes	The default hyperparameter settings are as follows: We set w = 0.1 and extract intermediate-layer features from the 9th, 10th, and 11th layers of the visual encoder, which are then fused with the final semantic features. The sum-softmax score is employed, with the fusion parameter set to α = 0.1 and the temperature parameter to τ = 0.01.
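
The Experiment Setup row quotes the default hyperparameters but does not define the sum-softmax score itself. The sketch below is one possible reading, not the authors' implementation: it assumes a NegLabel-style score, in which the softmax mass assigned to the ID labels is summed over a combined set of ID and negative labels. The class name, field names, function name, and the similarity arrays are all hypothetical, and the sketch does not attempt to model how the fusion parameter α combines text and image scores.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class DualCnstDefaults:
    """Defaults quoted in the Experiment Setup row (field names are hypothetical)."""
    w: float = 0.1                # weight w, as reported
    layers: tuple = (9, 10, 11)   # intermediate visual-encoder layers that are fused
    alpha: float = 0.1            # fusion parameter (its exact role is not modelled here)
    tau: float = 0.01             # softmax temperature


def sum_softmax_score(sim_id, sim_neg, tau):
    """Assumed NegLabel-style score: total softmax mass on the ID labels.

    sim_id  -- similarities between one input and the ID labels
    sim_neg -- similarities between the same input and negative labels
    """
    sim_id = np.asarray(sim_id, dtype=float)
    sim_neg = np.asarray(sim_neg, dtype=float)
    logits = np.concatenate([sim_id, sim_neg]) / tau
    logits = logits - logits.max()   # subtract the max for numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum()
    return float(probs[: sim_id.size].sum())


cfg = DualCnstDefaults()
# Made-up similarities: the input matches the ID labels better than the negatives.
id_like = sum_softmax_score([0.30, 0.25], [0.20, 0.22], cfg.tau)
# Swapped case: the input matches the negative labels better.
ood_like = sum_softmax_score([0.20, 0.22], [0.30, 0.25], cfg.tau)
```

Under this assumed form the score lies in [0, 1], and a small temperature such as τ = 0.01 makes it saturate quickly, so even modest similarity gaps push it close to 0 or 1.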