Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching

Authors: Chen Chen, Pengsheng Guo, Liangchen Song, Jiasen Lu, Rui Qian, Tsu-Jui Fu, Xinze Wang, Wei Liu, Yinfei Yang, Alex Schwing

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	On low-dimensional synthetic data, we visualize and quantify the effects of CAR-Flow. On higher-dimensional natural image data (ImageNet-256), equipping SiT-XL/2 with CAR-Flow reduces FID from 2.07 to 1.68, while introducing less than 0.6% additional parameters.
Researcher Affiliation	Industry	Work done while at Apple.
Pseudocode	Yes	Algorithm 1 Sampling via conditional flow matching (standard Gaussian source) and Algorithm 2 Sampling via conditional flow matching (reparameterized)
Open Source Code	No	All experiments are done using the axlearn framework. and Our baseline is SiT-XL/2 [Ma et al., 2024], re-implemented in JAX [Bradbury et al., 2018]; we strictly follow the original training recipe from the open-source SiT repository to replicate the results reported in the paper.
Open Datasets	Yes	To benchmark on a high-dimensional, large-scale dataset, we conduct experiments on ImageNet 256×256 data using v6e-256 TPUs. and To assess generalization beyond ImageNet, we trained a SiT-XL/2 baseline and CAR-Flow variants for 400k steps on CIFAR-10 using pixel-space diffusion
Dataset Splits	No	To benchmark on a high-dimensional, large-scale dataset, we conduct experiments on ImageNet 256×256 data using v6e-256 TPUs. and To assess generalization beyond ImageNet, we trained a SiT-XL/2 baseline and CAR-Flow variants for 400k steps on CIFAR-10 using pixel-space diffusion
Hardware Specification	Yes	All the synthetic data experiments were executed on the CPU cores of an Apple M1 Pro laptop. and Training is performed on a single v6e-256 TPU slice.
Software Dependencies	No	All experiments are done using the axlearn framework. and Our baseline is SiT-XL/2 [Ma et al., 2024], re-implemented in JAX [Bradbury et al., 2018]
Experiment Setup	Yes	We train with a batch size of 1024 using AdamW with β1 = 0.9 and β2 = 0.95. Learning rates are fine-tuned per parameter group: 1 × 10⁻³ for the source shift network, 1 × 10⁻⁴ for the target shift network, and 1 × 10⁻⁵ for all remaining parameters. and All models are sampled using the Heun SDE solver with 250 NFEs. and Both label-conditioning networks are trained with a higher learning rate 1 × 10⁻¹; all other hyper-parameters are unchanged.
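
The per-group learning rates quoted in the Experiment Setup row can be written down as a small configuration sketch. This is only an illustration: the group names (`source_shift`, `target_shift`, `default`), the parameter-path prefixes, and the helper `learning_rate_for` are hypothetical and do not come from the paper or from axlearn. Only the numeric values are taken from the quoted text, with the exponents read as negative powers of ten.

```python
# Hyperparameters quoted in the Experiment Setup row (ImageNet run).
# Group names and path prefixes below are hypothetical placeholders.
BATCH_SIZE = 1024
ADAMW_BETAS = (0.9, 0.95)

LEARNING_RATES = {
    "source_shift": 1e-3,  # source shift network
    "target_shift": 1e-4,  # target shift network
    "default": 1e-5,       # all remaining parameters
}


def learning_rate_for(param_path: str) -> float:
    """Return the learning rate for a parameter, keyed by a hypothetical path prefix."""
    for group in ("source_shift", "target_shift"):
        if param_path.startswith(group + "/"):
            return LEARNING_RATES[group]
    return LEARNING_RATES["default"]


if __name__ == "__main__":
    for path in (
        "source_shift/dense/kernel",
        "target_shift/dense/kernel",
        "blocks/0/attn/qkv",
    ):
        print(path, learning_rate_for(path))
```

In an actual JAX training setup, a mapping like this would typically be handed to the optimizer as per-group labels; how axlearn expresses that is not stated in the quoted excerpts.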