Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Towards General Modality Translation with Contrastive and Predictive Latent Diffusion Bridge

Authors: Nimrod Berman, Omkar Joglekar, Eitan Kosman, Dotan Di Castro, Omri Azencot

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we present comprehensive experiments that evaluate our general-purpose framework for modality translation across a diverse set of tasks and domains. We benchmark our approach on four tasks and perform extensive ablation studies to assess the contribution of architectural components, loss functions, and training strategies.
Researcher Affiliation	Collaboration	Nimrod Berman (1,2), Omkar Joglekar (1,3), Eitan Kosman (1), Dotan Di Castro (1), Omri Azencot (2). Affiliations: (1) Bosch AI Center; (2) Ben-Gurion University of the Negev; (3) Technical University of Munich.
Pseudocode	No	The paper describes the architecture with figures (Fig. 2, Fig. 5) and textual explanations, but it does not contain a formally labeled pseudocode or algorithm block.
Open Source Code	No	We attach code example for the review process and publish the code upon acceptance.
Open Datasets	Yes	We evaluate our proposed approach using the ShapeNet dataset [64] following the protocol established in [55]... The model is trained on the Flickr-Faces-HQ (FFHQ) dataset [28] and evaluated on samples from the CelebA-HQ dataset [26]... The nuScenes-Occupancy dataset serves as a 3D occupancy prediction benchmark derived from the nuScenes autonomous driving dataset.
Dataset Splits	Yes	We split the dataset by randomly assigning 70% of the objects to the training set, 10% to the validation set, and the remaining 20% to the test set, ensuring a balanced distribution across all categories.
Hardware Specification	Yes	Training was performed on a single NVIDIA A100 GPU for 1,000,000 iterations, which corresponds to approximately 3-5 days of runtime depending on the dataset.
Software Dependencies	No	We adopted the Variance Exploding (VE) configuration for sampling, following the hyperparameter settings provided in the official codebase of [76]. The hyperparameters we tuned are summarized in Table 6.
Experiment Setup	Yes	We adopted the Variance Exploding (VE) configuration for sampling, following the hyperparameter settings provided in the official codebase of [76]. The hyperparameters we tuned are summarized in Table 6. Our search focused on variations in batch size, learning rate, and latent space dimensionality. All loss terms were used without weighting adjustments. Training was performed on a single NVIDIA A100 GPU for 1,000,000 iterations, which corresponds to approximately 3-5 days of runtime depending on the dataset. We used the RAdam optimizer.
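
The Dataset Splits row quotes a random 70% / 10% / 20% assignment of objects to train, validation, and test sets, balanced across categories. Since the authors' code is not yet public, the following is only a minimal sketch of one way such a per-category split could be done. The function name `stratified_split`, the fixed seed, the rounding rule, and the toy data are all assumptions made for illustration, not details taken from the paper.

```python
import random
from collections import defaultdict


def stratified_split(items, fractions=(0.7, 0.1, 0.2), seed=0):
    """Split (object_id, category) pairs into train/val/test within each category.

    Hypothetical helper: the paper only states the 70/10/20 ratios and that
    the distribution is balanced across categories.
    """
    rng = random.Random(seed)  # fixed seed so the split is repeatable

    # Group object ids by category so each category is split separately.
    by_category = defaultdict(list)
    for object_id, category in items:
        by_category[category].append(object_id)

    splits = {"train": [], "val": [], "test": []}
    for category in sorted(by_category):
        ids = sorted(by_category[category])
        rng.shuffle(ids)
        n_train = round(fractions[0] * len(ids))
        n_val = round(fractions[1] * len(ids))
        splits["train"] += ids[:n_train]
        splits["val"] += ids[n_train:n_train + n_val]
        splits["test"] += ids[n_train + n_val:]  # remainder goes to test
    return splits


# Toy data (assumed): three categories with ten objects each.
items = [(f"{cat}_{i}", cat) for cat in ("chair", "table", "lamp") for i in range(10)]
splits = stratified_split(items)
```

Splitting inside each category, rather than over the pooled list, is what keeps the category proportions the same in all three sets. Recording the seed alongside the resulting id lists would let others recreate the exact split.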