Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Neighbor-aware Contrastive Disambiguation for Cross-Modal Hashing with Redundant Annotations

Authors: Chao Su, Likang Peng, Yuan Sun, Dezhong Peng, Xi Peng, Xu Wang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments conducted on three large-scale multimodal benchmarks demonstrate that our method consistently outperforms state-of-the-art approaches, thereby establishing a new standard for cross-modal hashing with redundant annotations.
Researcher Affiliation	Academia	1College of Computer Science, Sichuan University, Chengdu, China 2National Key Laboratory of Fundamental Algorithms and Models for Engineering Numerical Simulation, Sichuan University, Chengdu, China 3Tianfu Jincheng Laboratory, Chengdu, China 4Centre for Frontier AI Research (CFAR), A*STAR, Singapore
Pseudocode	Yes	Algorithm 1 Optimization Algorithm for NACD
Open Source Code	Yes	Code is available at https://github.com/Rose-bud/NACD.
Open Datasets	Yes	To evaluate the effectiveness of NACD, we conducted experiments on three benchmark datasets: MIRFlickr-25k (Flickr) [42], NUS-WIDE (NUS) [43], and MS-COCO (COCO) [44].
Dataset Splits	Yes	Following dataset partition strategies adopted in prior works [22, 30], we have structured the datasets accordingly: For MIRFlickr-25K [42], we select 2,000 data points as the test (query) dataset. The remaining data points are used to form the retrieval (database) dataset. From this, we further identified a training subset comprising 10,000 data points. In the case of NUS-WIDE [43], the test (query) dataset is made up of 2,100 data points. The remaining data points form the retrieval (database) dataset. From this dataset, we select 10,500 data points for training. For MS-COCO [44], we extract 5,000 data points to be used for testing. The rest of the data points are pooled into the retrieval (database) dataset. From this, we set aside 10,000 data points specifically for training. Table 3 shows the specific data split information of these three multimodal datasets in our experiments.
Hardware Specification	Yes	Our NACD is implemented using the PyTorch framework [48] and all experiments are carried out with 4 NVIDIA V100 GPUs.
Software Dependencies	No	Our NACD is implemented using the PyTorch framework [48] and all experiments are carried out with 4 NVIDIA V100 GPUs.
Experiment Setup	Yes	The model is trained using the RMSprop optimizer [47], with an initial learning rate of 1e-5 and a maximum of 100 epochs. The parameters δ and ξ in Eq. (8) are set to 0.2 and 1.0, respectively. Additionally, we employ a batch size n of 128. The model is evaluated every 20 epochs, with the first 10 epochs serving as a warm-up phase during which the CM strategy and class-wise threshold update are disabled. The number of neighbors C is set to 20 to ensure accurate neighborhood information.
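
The Dataset Splits and Experiment Setup rows above describe a query/retrieval/training partition and a training schedule. The following is a minimal sketch of how those reported numbers could be expressed in Python; it is not taken from the authors' repository. The names `SplitSizes`, `REPORTED_SPLITS`, `make_split`, `in_warmup`, and `should_evaluate` are hypothetical, uniform random sampling is an assumption (the quoted text does not state how data points are selected), and epochs are assumed to be counted from 1.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitSizes:
    """Query and training sizes as quoted in the Dataset Splits row."""
    query: int
    train: int


# Sizes quoted in the table; the dictionary itself is illustrative.
REPORTED_SPLITS = {
    "MIRFlickr-25K": SplitSizes(query=2000, train=10000),
    "NUS-WIDE": SplitSizes(query=2100, train=10500),
    "MS-COCO": SplitSizes(query=5000, train=10000),
}


def make_split(num_items, sizes, seed=0):
    """Return (query, retrieval, train) lists of item indices.

    Assumption: indices are drawn uniformly at random. The training
    subset is sampled from the retrieval set, as the table describes.
    """
    if sizes.query + sizes.train > num_items:
        raise ValueError("dataset too small for the requested split")
    rng = random.Random(seed)
    indices = list(range(num_items))
    rng.shuffle(indices)
    query = indices[:sizes.query]
    retrieval = indices[sizes.query:]
    train = rng.sample(retrieval, sizes.train)
    return query, retrieval, train


# Schedule values quoted in the Experiment Setup row.
MAX_EPOCHS = 100
WARMUP_EPOCHS = 10
EVAL_EVERY = 20


def in_warmup(epoch):
    """True for epochs 1..10 (assumed 1-indexed), when the CM strategy
    and class-wise threshold update are reported to be disabled."""
    return epoch <= WARMUP_EPOCHS


def should_evaluate(epoch):
    """True on every 20th epoch (assumed 1-indexed)."""
    return epoch % EVAL_EVERY == 0
```

Under these assumptions, calling `make_split` with a dataset size and one of the `REPORTED_SPLITS` entries yields a query set disjoint from the retrieval set and a training set contained in the retrieval set, which matches the partition described in the table. The dataset size passed in is the caller's responsibility, since the table does not quote dataset totals.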