Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Multitask Learning with Stochastic Interpolants

Authors: Hugo Negrel, Florentin Coeurdoux, Michael Albergo, Eric Vanden-Eijnden

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate the zero-shot efficacy of our method on conditional generation and inpainting, fine-tuning and posterior sampling, and multiscale modeling, suggesting its potential as a generic task-agnostic alternative to specialized models. ... Section 5, Numerical experiments: Below we provide numerical realization of some of the various objectives that can be fulfilled with the multitask objective.
Researcher Affiliation	Collaboration	Hugo Negrel, Capital Fund Management, 23 Rue de l'Université, 75007 Paris, EMAIL; Florentin Coeurdoux, Capital Fund Management, 23 Rue de l'Université, 75007 Paris, EMAIL; Michael S. Albergo, Society of Fellows, Harvard University, EMAIL; Eric Vanden-Eijnden, Machine Learning Lab, Capital Fund Management, 23 Rue de l'Université, 75007 Paris, EMAIL
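Pseudocode	Yes	Algorithm 1: Multitask learner. Input: samples $(x_0, x_1) \sim \mu$; choice of distribution $\nu(d\alpha, d\beta)$ and associated sampler. Repeat: draw batch $(x_0^i, x_1^i, \alpha_i, \beta_i)_{i=1}^{M} \sim \mu \times \nu$; compute $I_i = \alpha_i x_0^i + \beta_i x_1^i$; compute $\hat{L} = \frac{1}{M} \sum_{i=1}^{M} \left[ |\hat\eta_0(\alpha_i, \beta_i, I_i) - x_0^i|^2 + |\hat\eta_1(\alpha_i, \beta_i, I_i) - x_1^i|^2 \right]$; take a gradient step on $\hat{L}$ to update $\hat\eta_0$ and $\hat\eta_1$; until converged. Output: drifts $\hat\eta_0$ and $\hat\eta_1$. Algorithm 2: Multitask generator. Input: drifts $\hat\eta_0$, $\hat\eta_1$; choice of path $(\alpha_t, \beta_t)_{t \in [0,1]}$ tailored to the generation task; data $I(\alpha_0, \beta_0) = \alpha_0 x_0 + \beta_0 x_1$; diffusion coefficient $\epsilon_t \ge 0$; time step $h = 1/K$ with $K \in \mathbb{N}$. Initialize: $\hat{X}^\epsilon_0 = I(\alpha_0, \beta_0)$. For $k = 0, \ldots, K-1$: set $\hat\eta_0^k = \hat\eta_0(\alpha_{kh}, \beta_{kh}, \hat{X}^\epsilon_{kh})$, $\hat\eta_1^k = \hat\eta_1(\alpha_{kh}, \beta_{kh}, \hat{X}^\epsilon_{kh})$, and $z_k \sim N(0, \mathrm{Id})$; update $\hat{X}^\epsilon_{(k+1)h} = \hat{X}^\epsilon_{kh} + h \left( \dot\alpha_{kh} - \epsilon_{kh} \alpha_{kh}^{-1} \right) \hat\eta_0^k + h \dot\beta_{kh} \hat\eta_1^k + \sqrt{2 \epsilon_{kh} h}\, z_k$. Output: $\hat{X}^\epsilon_1 \overset{d}{=} I(\alpha_1, \beta_1)$ (approximately).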
Open Source Code	Yes	All the code and data used for the numerical experiments will be exposed in a GitHub repository. A set of instructions to fully reproduce the results will be provided in a README file. (From NeurIPS Paper Checklist, Q5)
Open Datasets	Yes	We evaluate our method on three datasets: MNIST, with images of size 28×28, CelebA, resized to 128×128, and of Animal Faces HQ focused on cat class, with images resized to 256×256. In our paper, we properly credit the creators and original owners of all assets used, including the MNIST dataset and any referenced algorithms or methodologies. For the MNIST dataset, which is in the public domain, we acknowledge its source and cite the original publication. All assets are used in accordance with their intended research purposes and we have carefully respected all applicable terms of use and licensing requirements throughout our research process. (From NeurIPS Paper Checklist, Q12 justification)
Dataset Splits	Yes	We evaluate our method on three datasets: MNIST, with images of size 28×28, CelebA, resized to 128×128, and of Animal Faces HQ focused on cat class, with images resized to 256×256. We present benchmark results for all methods across various image restoration tasks, evaluating the average peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) on 100 test images from each dataset: AFHQ-Cat (256×256) and CelebA (128×128). Table 2: Hyperparameters and architecture for MNIST, ϕ4 and maze datasets. ... # Training point 60,000 100,000 190,000 5,000
Hardware Specification	Yes	Table 2: Hyperparameters and architecture for MNIST, ϕ4 and maze datasets. ... # GPUs 1 1 4 4. A Table containing all relevant information will be provided in the Appendix. (From NeurIPS Paper Checklist, Q8 justification)
Software Dependencies	No	For all image generation experiments, the U-Net architecture originally proposed in Ho et al. (2020) is used. ... trained with an Adam optimizer Kingma and Ba (2017). (No specific software versions provided for frameworks or libraries)
Experiment Setup	Yes	Details of the experimental setup can be found in Appendix B. ... For all image generation experiments, the U-Net architecture originally proposed in Ho et al. (2020) is used. The specification of architecture hyperparameters as well as training hyperparameters are given in Table 2. Training was done for 200 epochs on batches comprised of 30 draws from the target, and 50 time slices. The objectives given in 3 and 4 were optimized using the Adam optimizer. The learning rate was set to .0001 and was dropped by a factor of 2 every 1500 iterations of training. To integrate the ODE/SDE when drawing samples, we used a simple Euler integrator. (And Table 2 provides further details on Batch Size, Training Steps, LR, etc.)
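
As a reading aid for the Pseudocode and Experiment Setup rows above, here is a minimal NumPy sketch of the batch loss from Algorithm 1, the Euler update from Algorithm 2, and the step-decay learning rate schedule. It is not the authors' code. The function names, the toy closed-form drifts, and the linear path are hypothetical choices made for illustration. The sketch also assumes that the update uses the time derivatives of α and β, a minus sign on the ε/α term, and a square root on the noise scale, since those symbols do not survive in the extracted pseudocode.

```python
import numpy as np

def multitask_loss(eta0, eta1, x0, x1, alpha, beta):
    """Monte Carlo estimate of the Algorithm 1 loss on one batch.

    x0, x1: arrays of shape (M, d); alpha, beta: arrays of shape (M,).
    eta0, eta1: callables (alpha, beta, I) -> array of shape (M, d).
    """
    interp = alpha[:, None] * x0 + beta[:, None] * x1  # I_i = alpha_i x0_i + beta_i x1_i
    err0 = eta0(alpha, beta, interp) - x0
    err1 = eta1(alpha, beta, interp) - x1
    return float(np.mean(np.sum(err0**2, axis=1) + np.sum(err1**2, axis=1)))

def multitask_generate(eta0, eta1, path, x_init, num_steps, eps=lambda t: 0.0, rng=None):
    """Euler(-Maruyama) integration along a chosen path (Algorithm 2, as read here).

    path: callable t -> (alpha_t, beta_t, d alpha/dt, d beta/dt)  [assumed form].
    eps:  callable t -> diffusion coefficient >= 0 (0 gives the ODE).
    """
    rng = np.random.default_rng() if rng is None else rng
    h = 1.0 / num_steps
    x = np.array(x_init, dtype=float)
    for k in range(num_steps):
        t = k * h
        a, b, da, db = path(t)
        e = eps(t)
        drift = (da - e / a) * eta0(a, b, x) + db * eta1(a, b, x)
        x = x + h * drift + np.sqrt(2.0 * e * h) * rng.standard_normal(x.shape)
    return x

def step_decay_lr(iteration, base_lr=1e-4, drop_every=1500, factor=0.5):
    """Learning rate 1e-4, halved every 1500 iterations (Experiment Setup row)."""
    return base_lr * factor ** (iteration // drop_every)

# Toy check (hypothetical): x0 ~ N(0, Id) and x1 fixed at the point C, for which
# the exact drifts are known in closed form: eta0 = (I - beta*C)/alpha, eta1 = C.
C = np.array([2.0, -1.0])

def toy_eta0(alpha, beta, x):
    a = np.asarray(alpha, dtype=float).reshape(-1, 1)
    b = np.asarray(beta, dtype=float).reshape(-1, 1)
    return (x - b * C) / a

def toy_eta1(alpha, beta, x):
    return np.broadcast_to(C, np.shape(x)).copy()

def linear_path(t):
    # alpha_t = 1 - t, beta_t = t, with constant time derivatives -1 and +1.
    return 1.0 - t, t, -1.0, 1.0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.standard_normal((4, 2))
    samples = multitask_generate(toy_eta0, toy_eta1, linear_path, x0, num_steps=50)
    print(samples)             # every row should land close to C
    print(step_decay_lr(3000))
```

In the toy setting the deterministic path (ε = 0) carries every initial sample to the fixed target point, which makes it a quick sanity check on the sign and derivative conventions assumed above. A trained network would replace the closed-form drifts.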