Moreau-Yosida f-divergences

Authors: Dávid Terjék

ICML 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "As an application of our results, we propose the Moreau-Yosida f-GAN, providing an implementation of the variational formulas for the Kullback-Leibler, reverse Kullback-Leibler, χ², reverse χ², squared Hellinger, Jensen-Shannon, Jeffreys, triangular discrimination and total variation divergences as GANs trained on CIFAR-10, leading to competitive results and a simple solution to the problem of uniqueness of the optimal critic." (see the variational-bound sketch after the table) |
| Researcher Affiliation | Academia | "Alfréd Rényi Institute of Mathematics, Budapest, Hungary." |
| Pseudocode | Yes | "Algorithm 1: Calculate γ_{φ,ν}(f) and f_{γ_{φ,ν}(f)}" (see the envelope sketch after the table) |
| Open Source Code | Yes | "Source code to reproduce the experiments is available at https://github.com/renyi-ai/moreau-yosida-f-divergences." |
| Open Datasets | Yes | "As an application of our results, we propose the Moreau-Yosida f-GAN, providing an implementation of the variational formulas for the Kullback-Leibler, reverse Kullback-Leibler, χ², reverse χ², squared Hellinger, Jensen-Shannon, Jeffreys, triangular discrimination and total variation divergences as GANs trained on CIFAR-10, leading to competitive results and a simple solution to the problem of uniqueness of the optimal critic." |
| Dataset Splits | No | The paper mentions training on CIFAR-10 and reports IS and FID scores, which are typically computed against a held-out or reference set, but it does not specify training/validation/test split percentages or any splitting methodology beyond stating that 'minibatches' were used for training. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU or GPU models, memory) used to run the experiments. |
| Software Dependencies | No | "The implementation was done in TensorFlow." However, no version number for TensorFlow or any other software dependency is provided. |
| Experiment Setup | Yes | "Training was done for 100000 iterations, with 5 gradient descent steps per iteration for the critic, and 1 for the generator. ... This particular experiment used ℓ = 10 and φ corresponding to the Kullback-Leibler divergence, but we observed identical behavior in other hyperparameter settings as well with a range of α close to 1. ... The implementation was done in TensorFlow, using the residual critic and generator architectures from Gulrajani et al. (2017)." (a schematic of this training schedule follows below) |
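
The quoted rows reference variational formulas for a family of f-divergences. For orientation only, the sketch below estimates the generic f-GAN variational lower bound D_f(P||Q) ≥ sup_T E_P[T] − E_Q[f*(T)] (Nowozin et al., 2016) for the Kullback-Leibler case, where the convex conjugate is f*(t) = exp(t − 1). This is the standard bound, not necessarily the tight Moreau-Yosida formula derived in the paper, and the critic `T` is a hypothetical stand-in for a trained network.

```python
import numpy as np

def kl_variational_bound(critic, x_p, x_q):
    """Monte Carlo estimate of E_P[T(x)] - E_Q[exp(T(y) - 1)].

    Generic f-GAN lower bound on KL(P || Q) with conjugate
    f*(t) = exp(t - 1); maximizing it over critics T tightens
    the bound. `critic` is any callable mapping a sample to a
    scalar (a hypothetical stand-in for a neural critic).
    """
    t_p = np.asarray([critic(x) for x in x_p])  # critic on P-samples
    t_q = np.asarray([critic(y) for y in x_q])  # critic on Q-samples
    return t_p.mean() - np.exp(t_q - 1.0).mean()

# Toy usage: P = N(1, 1), Q = N(0, 1), so KL(P || Q) = 0.5.
# The optimal critic is T*(x) = 1 + log(p(x)/q(x)) = x + 0.5.
rng = np.random.default_rng(0)
x_p = rng.normal(1.0, 1.0, size=10_000)
x_q = rng.normal(0.0, 1.0, size=10_000)
print(kl_variational_bound(lambda x: x + 0.5, x_p, x_q))  # ≈ 0.5
```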
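The Pseudocode row quotes Algorithm 1, which computes a Moreau-Yosida-style regularization of the critic. The sketch below shows one plausible minibatch approximation of the Pasch-Hausdorff envelope f_L(x) = inf_y { f(y) + L·d(x, y) }, i.e. the tightest L-Lipschitz minorant of the critic over the observed samples; the paper's Algorithm 1, its choice of metric, and the exact role of the quoted ℓ = 10 may all differ, so treat the names, the Euclidean metric, and the default `lipschitz=10.0` here as assumptions.

```python
import numpy as np

def minibatch_envelope(values, points, lipschitz=10.0):
    """Approximate the Pasch-Hausdorff (Moreau-Yosida w.r.t. a metric)
    envelope f_L(x_i) = min_j [ f(x_j) + L * ||x_i - x_j|| ], taking
    the infimum over the minibatch only (an assumption; the exact
    envelope takes it over the whole space).

    values : (n,) critic outputs f(x_j) on the batch
    points : (n, d) the batch samples x_j
    Returns the (n,) envelope, which is L-Lipschitz across the batch
    and pointwise <= the original values.
    """
    diffs = points[:, None, :] - points[None, :, :]  # (n, n, d)
    dists = np.linalg.norm(diffs, axis=-1)           # pairwise ||x_i - x_j||
    return np.min(values[None, :] + lipschitz * dists, axis=1)

# Toy usage: a spike violating the Lipschitz bound gets clipped down.
pts = np.array([[0.0], [0.1], [2.0]])
vals = np.array([0.0, 5.0, 1.0])
print(minibatch_envelope(vals, pts))  # [0. 1. 1.]
```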
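The Experiment Setup row pins down the optimization schedule (100000 iterations, 5 critic steps per generator step). A minimal schematic of that schedule follows, with `critic_step`, `generator_step`, and `sample_minibatch` as hypothetical stand-ins for the paper's TensorFlow implementation and the residual architectures of Gulrajani et al. (2017):

```python
def train(critic_step, generator_step, sample_minibatch,
          iterations=100_000, critic_steps=5):
    """Alternating schedule from the quoted setup: per iteration,
    5 gradient steps on the critic, then 1 on the generator.
    The three callables are hypothetical stand-ins, not the
    paper's actual API.
    """
    for _ in range(iterations):
        for _ in range(critic_steps):
            real = sample_minibatch()  # e.g. a CIFAR-10 minibatch
            critic_step(real)          # ascend the divergence bound
        generator_step()               # descend it w.r.t. the generator
```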