Stochastic Interpolants with Data-Dependent Couplings

Authors: Michael Samuel Albergo, Mark Goldstein, Nicholas Matthew Boffi, Rajesh Ranganath, Eric Vanden-Eijnden

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In Section 4, we apply the framework to numerical experiments on ImageNet, focusing on image inpainting and image super-resolution." and "Table 2: FID for Inpainting Task. FID comparison under two paradigms: a baseline, where ρ0 is a Gaussian with independent coupling to ρ1, and our data-dependent coupling detailed in Section 4.1."
Researcher Affiliation | Academia | 1 Center for Cosmology and Particle Physics, New York University; 2 Courant Institute of Mathematical Sciences, New York University; 3 Center for Data Science, New York University.
Pseudocode | Yes | Algorithm 1 (Training) and Algorithm 2 (Sampling, via the forward Euler method); see the sampling sketch after this table.
Open Source Code | Yes | The code is available at https://github.com/interpolants/couplings.
Open Datasets | Yes | "In our experiments, we set ρ1(x1) to correspond to ImageNet (either 256 or 512)."
Dataset Splits | Yes | "In our experiments, we set ρ1 to correspond to ImageNet (256 or 512), following prior work (Saharia et al., 2022; Ho et al., 2022a)." and "Table 3: FID-50k for Super-resolution, 64x64 to 256x256. FIDs for baselines taken from (Saharia et al., 2022; Ho et al., 2022a; Liu et al., 2023a)." (Table 3 reports separate Train and Valid columns per model.)
Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or cloud instance specifications) were provided for the experimental setup.
Software Dependencies | No | The paper mentions using PyTorch, Lightning Fabric, the Adam optimizer, a U-Net implementation, and the torchdiffeq library, but no specific version numbers are provided for these software components.
Experiment Setup | Yes | "Additional specific experimental details may be found in Appendix B." [...] "We use the following hyperparameters: Dim Mults: (1,1,2,3,4); Dim (channels): 256; Resnet block groups: 8; Learned Sinusoidal Cond: True; Learned Sinusoidal Dim: 32; Attention Dim Head: 64; Attention Heads: 4; Random Fourier Features: False." [...] "We use the Adam optimizer (Kingma & Ba, 2014), starting at learning rate 2e-4 with the StepLR scheduler, which scales the learning rate by γ = 0.99 every N = 1000 steps. We use no weight decay. We clip gradient norms at 10,000." An optimizer sketch based on this description follows below.
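
As a rough illustration of the Algorithm 2 row above (sampling via the forward Euler method), here is a minimal PyTorch sketch of forward-Euler integration of a learned velocity field. The names `velocity_model`, `x0`, and `n_steps` are hypothetical placeholders and are not taken from the released code; `x0` stands for a draw from the (possibly data-dependent) base density ρ0, e.g. a noised masked image in the inpainting setup.

```python
import torch

@torch.no_grad()
def euler_sample(velocity_model, x0, n_steps=100):
    """Integrate dx/dt = b(t, x) from t = 0 to t = 1 with forward Euler.

    velocity_model: callable (t, x) -> velocity, a stand-in for the learned field b.
    x0: tensor sampled from the base density rho_0 (shape [batch, ...]).
    """
    x = x0
    dt = 1.0 / n_steps
    for i in range(n_steps):
        # Broadcast the current time over the batch dimension.
        t = torch.full((x.shape[0],), i * dt, device=x.device, dtype=x.dtype)
        x = x + dt * velocity_model(t, x)  # forward Euler update
    return x
```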
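
The optimizer settings quoted in the Experiment Setup row translate fairly directly into PyTorch. Below is a minimal sketch assuming the StepLR scheduler is stepped once per training iteration; `model`, `loader`, and `compute_interpolant_loss` are hypothetical placeholders, not names from the authors' code.

```python
import torch

# Placeholder module standing in for the U-Net velocity field.
model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)

# Adam at learning rate 2e-4 with no weight decay, as in the quoted setup.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, weight_decay=0.0)
# StepLR scales the learning rate by gamma = 0.99 every 1000 steps.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1000, gamma=0.99)

for step, batch in enumerate(loader):              # `loader` yields training batches (hypothetical)
    loss = compute_interpolant_loss(model, batch)  # hypothetical loss helper
    optimizer.zero_grad()
    loss.backward()
    # Gradient-norm clipping at 10,000, per the quoted setup.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10_000)
    optimizer.step()
    scheduler.step()
```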