Generative Modeling through the Semi-dual Formulation of Unbalanced Optimal Transport

Authors: Jaemoo Choi, Jaewoong Choi, Myungjoo Kang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate these properties empirically through experiments. Moreover, we study the theoretical upper bound of divergence between distributions in UOT. Our model outperforms existing OT-based generative models, achieving FID scores of 2.97 on CIFAR-10 and 6.36 on CelebA-HQ-256.
Researcher Affiliation | Academia | Jaemoo Choi (Seoul National University, toony42@snu.ac.kr); Jaewoong Choi (Korea Institute for Advanced Study, chwj1475@kias.re.kr); Myungjoo Kang (Seoul National University, mkang@snu.ac.kr)
Pseudocode | Yes | Algorithm 1: Training algorithm of UOTM (an illustrative sketch of such a training step appears after this table).
Open Source Code | Yes | The code is available at https://github.com/Jae-Moo/UOTM.
Open Datasets | Yes | Toy data: For all the Toy dataset experiments, we used the same generator and discriminator architectures. ... CIFAR-10: We utilized all 50,000 samples. ... CelebA-HQ: We used all 120,000 samples.
Dataset Splits | No | The paper does not explicitly state training, validation, and test splits for the datasets; it reports only the total number of samples used for CIFAR-10 and CelebA-HQ and how the toy data were composed.
Hardware Specification | Yes | Training on CIFAR-10 takes more than 70 hours for Score SDE [69], 48 hours for DDGAN [80], and 35-40 hours for RGM on four Tesla V100 GPUs. OTM takes approximately 30-35 hours to converge, while our model takes only about 25 hours.
Software Dependencies | No | The paper mentions using the "Adam optimizer" and states that the implementation for the large model follows Choi et al. [13], but it does not provide specific version numbers for software libraries or frameworks (e.g., PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | For all the Toy dataset experiments, we used the same generator and discriminator architectures. The dimension of the auxiliary variable z is set to one. For a generator, we passed z through two fully connected (FC) layers with a hidden dimension of 128, resulting in a 128-dimensional embedding. ... We used a batch size of 256, a learning rate of 10^-4, and 2000 epochs. ... For the small model setting of UOTM, we employed the architecture of Balaji et al. [8]. ... We set a batch size of 128, 200 epochs, and learning rates of 2 x 10^-4 and 10^-4 for the generator and discriminator, respectively. The Adam optimizer with β1 = 0.5, β2 = 0.9 is employed. Moreover, we use R1 regularization with λ = 0.2. (A sketch wiring up these optimizer and R1 settings also follows the table.)
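Since only the algorithm's title is quoted above ("Algorithm 1 Training algorithm of UOTM"), the following is a minimal sketch of how an adversarial training step for a semi-dual UOT objective might look. The cost function `cost`, the coefficient `tau`, the Softplus choice for the conjugate entropy function `psi_star`, and the `T(x, z)` / `v(y)` network interfaces are illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch only: one adversarial training step for a semi-dual UOT
# objective, loosely following the structure named by "Algorithm 1".
# cost(), psi_star(), tau, and the T(x, z) / v(y) interfaces are assumptions.
import torch
import torch.nn.functional as F

def cost(x, y, tau=1e-3):
    # Quadratic transport cost c(x, y) = tau * ||x - y||^2 per sample (tau assumed).
    return tau * (x - y).flatten(1).pow(2).sum(dim=1)

def psi_star(s):
    # A convex, non-decreasing conjugate entropy function; Softplus is one
    # smooth choice, used here purely for illustration.
    return F.softplus(s)

def uotm_step(T, v, opt_T, opt_v, x_src, y_real, z):
    # Potential (discriminator) update: minimize
    #   E_x[ psi_star(-(c(x, T(x,z)) - v(T(x,z)))) ] + E_y[ psi_star(-v(y)) ].
    y_fake = T(x_src, z).detach()
    loss_v = psi_star(-(cost(x_src, y_fake) - v(y_fake).view(-1))).mean() \
             + psi_star(-v(y_real).view(-1)).mean()
    opt_v.zero_grad(); loss_v.backward(); opt_v.step()

    # Generator update: approximate the c-transform inf_y [c(x, y) - v(y)]
    # by minimizing c(x, T(x,z)) - v(T(x,z)) over the generator parameters.
    y_fake = T(x_src, z)
    loss_T = (cost(x_src, y_fake) - v(y_fake).view(-1)).mean()
    opt_T.zero_grad(); loss_T.backward(); opt_T.step()
    return loss_v.item(), loss_T.item()
```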
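The small-model hyperparameters quoted in the Experiment Setup row (Adam with β1 = 0.5, β2 = 0.9; learning rates of 2 x 10^-4 and 10^-4 for generator and discriminator; R1 with λ = 0.2) could be wired up as below. PyTorch and the (λ/2)·||grad||^2 scaling convention for the R1 penalty are assumptions, since the paper does not pin a framework; only the numeric values come from the quoted text.

```python
# Minimal sketch of the reported small-model optimizer settings and R1 penalty.
import torch

def make_optimizers(generator, discriminator):
    # Adam with beta1 = 0.5, beta2 = 0.9; lr = 2e-4 (generator), 1e-4 (discriminator).
    opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.9))
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.9))
    return opt_g, opt_d

def r1_penalty(discriminator, x_real, r1_lambda=0.2):
    # R1 regularization: penalize the squared gradient norm of the discriminator
    # output with respect to real inputs; added to the discriminator loss.
    x_real = x_real.detach().requires_grad_(True)
    d_out = discriminator(x_real).sum()
    (grad,) = torch.autograd.grad(d_out, x_real, create_graph=True)
    return 0.5 * r1_lambda * grad.flatten(1).pow(2).sum(dim=1).mean()
```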