Schrödinger Bridge Flow for Unpaired Data Translation

Authors: Valentin De Bortoli, Iryna Korshunova, Andriy Mnih, Arnaud Doucet

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the performance of our algorithm on a variety of unpaired data translation tasks."
Researcher Affiliation | Industry | Valentin De Bortoli (Google DeepMind), Iryna Korshunova (Google DeepMind), Andriy Mnih (Google DeepMind), Arnaud Doucet (Google DeepMind)
Pseudocode | Yes | Algorithm 1: α-Diffusion Schrödinger Bridge Matching (a hedged training-loop sketch follows the table)
Open Source Code | No | "Due to IP restrictions, we cannot share the codebase used for this paper. However, we plan to release some notebooks in order to reproduce experiments in a small scale setting."
Open Datasets | Yes | "We closely follow the setup of Shi et al. (2023) and De Bortoli et al. (2021), and train the models to transfer between 10 EMNIST letters, A-E and a-e, and 10 MNIST digits (CC BY-ND 4.0 license). [...] We consider the problem of image translation between Cat and Wild domains of AFHQ (Choi et al. (2020); CC BY-NC 4.0 DEED licence)."
Dataset Splits | Yes | "For DSBM finetuning, we perform 30 outer iterations, i.e. alternating between training the forward and the backward networks, while at each outer iteration a network is trained for 5000 steps. [...] For evaluation, we compute FID based on the whole MNIST training set of 60000 examples and a set of 4000 samples that were initialised from each test image in the EMNIST dataset."
Hardware Specification | Yes | "Pretraining a bidirectional model on 4 v3 TPUs takes 1 hour, while the online finetuning stage requires 4 hours on 16 v3 TPUs."
Software Dependencies | No | "To optimise our networks, we use Adam (Kingma and Ba, 2015) with β = (0.9, 0.999)..." The paper states the Adam optimizer and its parameters but does not specify versions for programming languages, libraries (e.g., PyTorch, TensorFlow), or other key software components (a possible optimiser configuration is sketched after the table).
Experiment Setup | Yes | "For every model used in the paper, we provide hyperparameters in Table 3. [...] To optimise our networks, we use Adam (Kingma and Ba, 2015) with β = (0.9, 0.999), and we modify the gradients to keep their global norm below 1.0. We re-initialise the optimiser's state when the finetuning phase starts."
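The Pseudocode and Dataset Splits rows point to Algorithm 1 (α-Diffusion Schrödinger Bridge Matching) and to an alternating finetuning schedule: 30 outer iterations, each training one of the forward/backward networks for 5000 steps. Because the codebase is not released, the following is only a minimal sketch of how such an alternating Brownian-bridge-matching loop could be organised; the toy linear drift network, the sample_pairs placeholder, the step size, and the choice of JAX are illustrative assumptions, not the authors' implementation.

```python
import jax
import jax.numpy as jnp

# Values quoted in the Dataset Splits row; everything else is illustrative.
N_OUTER = 30      # outer iterations, alternating forward / backward
N_INNER = 5000    # training steps per outer iteration
SIGMA = 1.0       # reference-process volatility (hypothetical choice)
DIM = 2           # toy data dimension (hypothetical)

def drift_fn(params, x, t):
    # Toy linear drift network standing in for the paper's image network.
    return x @ params["w"] + t * params["b"]

def bridge_matching_loss(params, key, x0, x1):
    # Sample a point on the Brownian bridge joining x0 (t=0) and x1 (t=1)
    # and regress the drift network onto the bridge's conditional drift.
    key_t, key_eps = jax.random.split(key)
    t = jax.random.uniform(key_t, (x0.shape[0], 1), maxval=0.999)
    eps = jax.random.normal(key_eps, x0.shape)
    xt = (1.0 - t) * x0 + t * x1 + SIGMA * jnp.sqrt(t * (1.0 - t)) * eps
    target = (x1 - xt) / (1.0 - t)          # conditional bridge drift
    pred = drift_fn(params, xt, t)
    return jnp.mean((pred - target) ** 2)

def sample_pairs(key, params_other, n=128):
    # Placeholder: in DSBM the (x0, x1) couplings come from simulating the
    # other direction's current model; here we just draw Gaussian noise.
    k0, k1 = jax.random.split(key)
    return jax.random.normal(k0, (n, DIM)), jax.random.normal(k1, (n, DIM))

def train_direction(key, params, params_other, lr=1e-4):
    # Plain gradient descent stand-in; the paper's optimiser is shown below.
    grad_fn = jax.jit(jax.grad(bridge_matching_loss))
    for _ in range(N_INNER):
        key, k_pairs, k_loss = jax.random.split(key, 3)
        x0, x1 = sample_pairs(k_pairs, params_other)
        grads = grad_fn(params, k_loss, x0, x1)
        params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
    return params

key = jax.random.PRNGKey(0)
init = lambda k: {"w": 0.01 * jax.random.normal(k, (DIM, DIM)),
                  "b": jnp.zeros((DIM,))}
k_f, k_b = jax.random.split(key)
fwd, bwd = init(k_f), init(k_b)
for it in range(N_OUTER):
    key, k1, k2 = jax.random.split(key, 3)
    fwd = train_direction(k1, fwd, bwd)   # train forward given backward
    bwd = train_direction(k2, bwd, fwd)   # train backward given forward
```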
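The Software Dependencies and Experiment Setup rows fix the optimiser hyperparameters (Adam with β = (0.9, 0.999), gradients clipped to a global norm of 1.0, optimiser state re-initialised at the start of finetuning) but name no framework or library versions. Purely as an assumption, here is how that configuration could look in Optax; the learning rate and parameter shapes are placeholders, and nothing here is confirmed by the paper.

```python
import jax.numpy as jnp
import optax

# Adam with beta = (0.9, 0.999) and global-norm gradient clipping at 1.0,
# as quoted above. The learning rate is a placeholder, not from the paper.
optimizer = optax.chain(
    optax.clip_by_global_norm(1.0),
    optax.adam(learning_rate=1e-4, b1=0.9, b2=0.999),
)

params = {"w": jnp.zeros((8, 8))}          # stand-in parameters
opt_state = optimizer.init(params)         # optimiser state for pretraining

def update(params, opt_state, grads):
    # One clipped Adam step.
    updates, opt_state = optimizer.update(grads, opt_state, params)
    return optax.apply_updates(params, updates), opt_state

# "We re-initialise the optimiser's state when the finetuning phase starts":
opt_state = optimizer.init(params)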