On Relativistic f-Divergences

Authors: Alexia Jolicoeur-Martineau

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | All experiments were done with the spectral GAN architecture for 32x32 images (Miyato et al., 2018) in PyTorch (Paszke et al., 2017). We used the standard hyperparameters: learning rate (lr) = .0002, batch size (k) = 32, and the ADAM optimizer (Kingma & Ba, 2014) with parameters (β1, β2) = (.50, .999). We trained the models for 100k iterations with one critic update per generator update. For the datasets, we used CIFAR-10 (50k training images from 10 categories) (Krizhevsky, 2009), CelebA (200k face images of celebrities) (Liu et al., 2015) and CAT (10k images of cats) (Zhang et al., 2008). All models were trained using the same seed (seed=1) with a single GPU. To evaluate the quality of generated outputs, we used the Fréchet Inception Distance (FID) (Heusel et al., 2017).
Researcher Affiliation | Academia | Mila, Université de Montréal.
Pseudocode | No | The paper contains mathematical formulas and proofs but no structured pseudocode or algorithm blocks.
Open Source Code | Yes | See code for details; the code to reproduce the experiments is available at https://github.com/AlexiaJM/relativistic-f-divergences.
Open Datasets | Yes | For the datasets, we used CIFAR-10 (50k training images from 10 categories) (Krizhevsky, 2009), CelebA (200k face images of celebrities) (Liu et al., 2015) and CAT (10k images of cats) (Zhang et al., 2008).
Dataset Splits | No | The paper names the datasets used but does not provide explicit training/validation/test splits, percentages, or per-split sample counts.
Hardware Specification | No | All models were trained using the same seed (seed=1) with a single GPU. The mention of a 'single GPU' is not specific enough to identify the hardware used.
Software Dependencies | No | All experiments were done with the spectral GAN architecture for 32x32 images (Miyato et al., 2018) in PyTorch (Paszke et al., 2017). No specific version numbers for PyTorch or other libraries are provided.
Experiment Setup | Yes | We used the standard hyperparameters: learning rate (lr) = .0002, batch size (k) = 32, and the ADAM optimizer (Kingma & Ba, 2014) with parameters (β1, β2) = (.50, .999). We trained the models for 100k iterations with one critic update per generator update. (Minimal configuration and FID sketches follow the table.)
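
The training configuration quoted above maps directly onto a few lines of PyTorch. The sketch below is a minimal, hypothetical illustration with placeholder networks; it mirrors the reported hyperparameters (lr = 0.0002, batch size 32, Adam with β = (0.5, 0.999), spectral normalization, seed = 1) but is not the author's released code, which is linked above.

```python
# Minimal sketch (hypothetical placeholder networks, not the author's code)
# of the reported training configuration: SNGAN-style spectral normalization,
# Adam with (beta1, beta2) = (0.5, 0.999), lr = 0.0002, batch size 32, seed 1.
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

lr = 2e-4               # learning rate reported in the paper
batch_size = 32         # batch size reported in the paper
betas = (0.5, 0.999)    # Adam momentum parameters
n_iterations = 100_000  # one critic update per generator update

torch.manual_seed(1)    # the paper reports training every model with seed=1

# Placeholder networks; the paper uses the 32x32 CNN architecture of
# Miyato et al. (2018) with spectral normalization on the discriminator.
G = nn.Sequential(nn.Linear(128, 3 * 32 * 32), nn.Tanh())
D = nn.Sequential(
    nn.Flatten(),
    spectral_norm(nn.Linear(3 * 32 * 32, 256)),
    nn.LeakyReLU(0.1),
    spectral_norm(nn.Linear(256, 1)),
)

opt_G = torch.optim.Adam(G.parameters(), lr=lr, betas=betas)
opt_D = torch.optim.Adam(D.parameters(), lr=lr, betas=betas)
```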
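
Generated samples are scored with the Fréchet Inception Distance (Heusel et al., 2017). As a worked illustration of the metric itself, not the evaluation pipeline used in the paper, the following computes the FID formula from precomputed activation means and covariances; the toy statistics and the 16-dimensional setting are made up for the example.

```python
# Fréchet Inception Distance between two Gaussians fitted to Inception
# activations: FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2}).
# Illustrative only; real use fits mu/S to Inception-v3 pool3 features.
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)  # matrix square root
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts from numerical error
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Toy statistics in 16 dimensions (real FID uses 2048-d pool3 features)
rng = np.random.default_rng(0)
mu_r, mu_g = rng.normal(size=16), rng.normal(size=16)
a, b = rng.normal(size=(16, 64)), rng.normal(size=(16, 64))
sigma_r = a @ a.T / 64
sigma_g = b @ b.T / 64
print(frechet_distance(mu_r, sigma_r, mu_g, sigma_g))
```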