Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Neighbor-aware Contrastive Disambiguation for Cross-Modal Hashing with Redundant Annotations
Authors: Chao Su, Likang Peng, Yuan Sun, Dezhong Peng, Xi Peng, Xu Wang
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on three large-scale multimodal benchmarks demonstrate that our method consistently outperforms state-of-the-art approaches, thereby establishing a new standard for cross-modal hashing with redundant annotations. |
| Researcher Affiliation | Academia | 1College of Computer Science, Sichuan University, Chengdu, China 2National Key Laboratory of Fundamental Algorithms and Models for Engineering Numerical Simulation, Sichuan University, Chengdu, China 3Tianfu Jincheng Laboratory, Chengdu, China 4Centre for Frontier AI Research (CFAR), A*STAR, Singapore |
| Pseudocode | Yes | Algorithm 1 Optimization Algorithm for NACD |
| Open Source Code | Yes | Code is available at https://github.com/Rose-bud/NACD. |
| Open Datasets | Yes | To evaluate the effectiveness of NACD, we conducted experiments on three benchmark datasets: MIRFlickr-25k (Flickr) [42], NUS-WIDE (NUS) [43], and MS-COCO (COCO) [44]. |
| Dataset Splits | Yes | Following dataset partition strategies adopted in prior works [22, 30], we have structured the datasets accordingly: For MIRFlickr-25K [42], we select 2,000 data points as the test (query) dataset. The remaining data points are used to form the retrieval (database) dataset. From this, we further identified a training subset comprising 10,000 data points. In the case of NUS-WIDE [43], the test (query) dataset is made up of 2,100 data points. The remaining data points form the retrieval (database) dataset. From this dataset, we select 10,500 data points for training. For MS-COCO [44], we extract 5,000 data points to be used for testing. The rest of the data points are pooled into the retrieval (database) dataset. From this, we set aside 10,000 data points specifically for training. Table 3 shows the specific data split information of these three multimodal datasets in our experiments. |
| Hardware Specification | Yes | Our NACD is implemented using the PyTorch framework [48] and all experiments are carried out with 4 NVIDIA V100 GPUs. |
| Software Dependencies | No | Our NACD is implemented using the PyTorch framework [48] and all experiments are carried out with 4 NVIDIA V100 GPUs. |
| Experiment Setup | Yes | The model is trained using the RMSprop optimizer [47], with an initial learning rate of 1e-5 and a maximum of 100 epochs. The parameters δ and ξ in Eq. (8) are set to 0.2 and 1.0, respectively. Additionally, we employ a batch size n of 128. The model is evaluated every 20 epochs, with the first 10 epochs serving as a warm-up phase during which the CM strategy and class-wise threshold update are disabled. The number of neighbors C is set to 20 to ensure accurate neighborhood information. |
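
For readers who want to reuse the reported settings, the Dataset Splits and Experiment Setup rows can be collected into a small configuration sketch. This is an illustration only, not the authors' code: the class name `NACDSetup`, its field names, and both helper functions are hypothetical, and the schedule assumes 1-based epoch counting with evaluation on every 20th epoch, which the quoted text does not spell out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NACDSetup:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""
    optimizer: str = "RMSprop"
    learning_rate: float = 1e-5   # initial learning rate, read as 1e-5
    max_epochs: int = 100
    batch_size: int = 128         # batch size n
    eval_every: int = 20          # evaluation interval in epochs
    warmup_epochs: int = 10       # CM strategy and threshold update disabled
    num_neighbors: int = 20       # number of neighbors C
    delta: float = 0.2            # parameter in Eq. (8)
    xi: float = 1.0               # parameter in Eq. (8)


# Query and training sizes quoted in the Dataset Splits row.
# The retrieval (database) set is whatever remains after removing the query set.
SPLITS = {
    "MIRFlickr-25K": {"query": 2000, "train": 10000},
    "NUS-WIDE": {"query": 2100, "train": 10500},
    "MS-COCO": {"query": 5000, "train": 10000},
}


def in_warmup(epoch: int, cfg: NACDSetup) -> bool:
    """Return True during the warm-up phase (assumption: epochs are counted from 1)."""
    return epoch <= cfg.warmup_epochs


def evaluation_epochs(cfg: NACDSetup) -> List[int]:
    """List the epochs at which the model would be evaluated (assumption: every eval_every-th epoch)."""
    return [e for e in range(1, cfg.max_epochs + 1) if e % cfg.eval_every == 0]


if __name__ == "__main__":
    cfg = NACDSetup()
    print(evaluation_epochs(cfg))
    print(in_warmup(5, cfg), in_warmup(11, cfg))
```

Under these assumptions, the warm-up phase ends before the first evaluation, so every evaluation would run with the CM strategy and class-wise threshold update enabled.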