Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Generalized Contrastive Learning for Universal Multimodal Retrieval

Authors: Jungsoo Lee, Janghoon Cho, Hyojin Park, Durga Malladi, Kyuwoong Hwang, Fatih Porikli, Sungha Choi

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of GCL by showing consistent performance improvements on off-the-shelf multimodal retrieval models (e.g., VISTA, CLIP, and TinyCLIP) using the M-BEIR, MMEB, and CoVR benchmarks. 4 Experiments
Researcher Affiliation | Industry | Jungsoo Lee, Janghoon Cho, Hyojin Park, Munawar Hayat, Kyuwoong Hwang, Fatih Porikli, Sungha Choi; Qualcomm AI Research; EMAIL
Pseudocode | Yes | Question: For each theoretical result, does the paper provide the full set of assumptions and a complete (and correct) proof? Answer: [NA] Justification: We do not include theoretical assumptions and proofs in our paper. Guidelines: ... Question: Does the paper fully disclose all the information needed to reproduce the main experimental results of the paper to the extent that it affects the main claims and/or conclusions of the paper (regardless of whether the code and data are provided or not)? Answer: [Yes] Justification: We included the implementation details regarding our experiments in Section 4.1 and our Supplementary. We also include the pseudocodes in our Supplementary in order to help readers reproduce our algorithm.
Open Source Code | No | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: We are currently refactoring and cleaning the codes. We are planning to release the codes after internal review of the codes is finalized.
Open Datasets | Yes | We evaluate the effectiveness of our proposed GCL using standard multimodal retrieval benchmarks: M-BEIR [9], MMEB [27], and CoVR [28] (footnote 3). ... For the image-caption paired dataset used during finetuning, we use the LLaVA Visual Instruct Pretrain LCS-558K dataset [35] ... Footnote 3: M-BEIR and CoVR datasets are under the MIT license, and MMEB is under the Apache-2.0 license.
Dataset Splits | Yes | We evaluate the effectiveness of our proposed GCL using standard multimodal retrieval benchmarks: M-BEIR [9], MMEB [27], and CoVR [28]. ... Note that our experiments are conducted in a zero-shot setting, meaning the model is not fine-tuned on the training set of the evaluation benchmark.
Hardware Specification | Yes | Question: For each experiment, does the paper provide sufficient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments? Answer: [Yes] Justification: We described the computing resources including the type of GPU and memory consumed for the experiments in our Supplementary.
Software Dependencies | Yes | Question: Does the paper provide SPECIFIC ANCILLARY SOFTWARE DETAILS (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment? Answer: [Yes] Justification: We included the implementation details regarding our experiments in Section 4.1 and our Supplementary.
Experiment Setup | Yes | Question: Does the paper specify all the training and test details (e.g., data splits, hyperparameters, how they were chosen, type of optimizer, etc.) necessary to understand the results? Answer: [Yes] Justification: Yes, we specified the training and test details in our implementation details of Section 4.1 of the main paper and Supplementary.