Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

LABridge: Text–Image Latent Alignment Framework via Mean-Conditioned OU Process

Authors: Huiyang Shao, Xin Xia, Yuxi Ren, Xing Wang, Xuefeng Xiao

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on standard text-to-image benchmarks show that LABridge achieves better text image alignment metric and competitive FID scores compared to leading diffusion baselines. We conducted a series of experiments to verify the effectiveness of LABridge under various settings.
Researcher Affiliation	Collaboration	Huiyang Shao1,2 Xin Xia2,* Yuxi Ren2 Xing Wang2 Xuefeng Xiao2, 1Tsinghua University 2ByteDance Seed
Pseudocode	Yes	The overall training procedure for LABridge consists of two sequential stages (details of training procedure are provided in Algo. 1 in Appendix B): The overall inference procedure for LABridge mainly based on Eq. 14 (details of inference procedure are provided in Algo. 2 in Appendix B).
Open Source Code	No	We will provide the source code and data after the draft is completed.
Open Datasets	Yes	The training data comprises a selection from COYO [Byeon et al., 2022] datasets, following selection criteria in [Lin et al., 2024, Ren et al., 2024]. We evaluated performance on standard benchmarks: COCO [Lin et al., 2014], ImageNet [Deng et al., 2009], MJHQ-30K [Li et al., 2024a].
Dataset Splits	Yes	We evaluated performance on standard benchmarks: COCO [Lin et al., 2014], ImageNet [Deng et al., 2009], MJHQ-30K [Li et al., 2024a]. We evaluate all models on the COCO validation set [Lin et al., 2014], using two primary metrics: FID [Heusel et al., 2017] and CLIP score [Radford et al., 2021, Hessel et al., 2021]. Specifically, we report FID-10K, where prompts are randomly sampled from the validation set.
Hardware Specification	Yes	All code was performed on 8 A100 GPUs machine.
Software Dependencies	No	No specific software versions for key libraries (e.g., PyTorch, TensorFlow) or other dependencies are provided. The paper mentions using 'NV-Embed-v2' and 'DiT-XL/2 model' but without explicit version numbers for the software framework.
Experiment Setup	Yes	The training utilized AdamW optimizer with a learning rate of 1e-5, β1 = β2 = 0.9, weight decay of 0.03, batch size of 16, and run for 200 epochs. All images are preprocessed with center crop and resized (1024 × 1024). We tune the weighting wa, ws and wr in the range of [0, 1]. After a brief sweep, we used wa = 1.0, ws = 0.5, and wr = 0.2.
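
For readers who want to re-run the reported configuration, the Experiment Setup row can be captured as a small config object. This is only a sketch under stated assumptions, not the authors' code: the class name ExperimentSetup, the field names, and the helper weighted_total are hypothetical, the learning rate is read as 1e-5, and the source does not say which loss terms the weights wa, ws and wr apply to or how they are combined (a plain weighted sum is assumed here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters as reported in the Experiment Setup row (names are hypothetical)."""
    optimizer: str = "AdamW"
    learning_rate: float = 1e-5      # assumption: reported value read as 1e-5
    beta1: float = 0.9
    beta2: float = 0.9
    weight_decay: float = 0.03
    batch_size: int = 16
    epochs: int = 200
    image_size: tuple = (1024, 1024)  # center crop, then resize
    w_a: float = 1.0                  # weights chosen after the reported sweep
    w_s: float = 0.5
    w_r: float = 0.2


def weighted_total(cfg: ExperimentSetup, loss_a: float, loss_s: float, loss_r: float) -> float:
    """Assumed combination: a weighted sum of three loss terms.

    The source only reports the weights; what each term measures is not stated.
    """
    return cfg.w_a * loss_a + cfg.w_s * loss_s + cfg.w_r * loss_r


cfg = ExperimentSetup()

# The reported sweep range for all three weights is [0, 1].
for w in (cfg.w_a, cfg.w_s, cfg.w_r):
    assert 0.0 <= w <= 1.0
```

Keeping the values in one frozen object makes it easy to log the exact configuration alongside results. Software versions are not reported (see the Software Dependencies row), so they are deliberately left out.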