Distribution Augmentation for Generative Modeling

Authors: Heewoo Jun, Rewon Child, Mark Chen, John Schulman, Aditya Ramesh, Alec Radford, Ilya Sutskever

ICML 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We demonstrate this is a more effective regularizer than standard methods, and use it to train a 152M parameter autoregressive model on CIFAR-10 to 2.56 bits per dim (relative to the state-of-the-art 2.80). Samples from this model attain FID 12.75 and IS 8.40, outperforming the majority of GANs. (See the bits-per-dim note after the table.) |
| Researcher Affiliation | Industry | OpenAI, San Francisco, California, USA. Correspondence to: Heewoo Jun <heewoo@openai.com>, Rewon Child <rewon@openai.com>. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release our model weights and code at https://github.com/openai/distribution_augmentation. |
| Open Datasets | Yes | In this section, we primarily study an autoregressive model (the Sparse Transformer (Child et al., 2019)) and its performance on the natural image benchmark datasets CIFAR-10 and ImageNet-64. |
| Dataset Splits | Yes | CIFAR-10 validation bits per dim of different augmentation strategies across model sizes (in millions of parameters). Baseline and horizontal flipping do not use DistAug. (Figure 3a) |
| Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., GPU models, CPU types, memory). |
| Software Dependencies | No | The paper mentions using existing codebases like those from Salimans et al. (2017) and Ho et al. (2019), but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Detailed hyperparameter settings for experiments are available in the Supplementary Material. (Section 4) For CIFAR-10, 58M and 152M models use the same hyperparameters as the ones in (Child et al., 2019) except they are trained with a learning rate of 0.00015 for 1000-1500 epochs with a cosine decay (Radford et al., 2018) over 10000 epochs. Batch size for all CIFAR-10 experiments was 16. (A.1) (See the learning-rate sketch after the table.) |
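The Experiment Setup row quotes a learning rate of 0.00015, a cosine decay over 10000 epochs, training for 1000-1500 epochs, and batch size 16. Below is a minimal, illustrative sketch of such a schedule in plain Python; the function name `cosine_lr`, the decay-to-zero floor, and the exact schedule shape are assumptions for illustration, not taken from the released code.

```python
import math

def cosine_lr(epoch: int,
              base_lr: float = 0.00015,   # quoted CIFAR-10 learning rate
              total_epochs: int = 10000,  # quoted cosine-decay horizon
              min_lr: float = 0.0) -> float:  # assumed floor; not stated in the quote
    """Cosine-decayed learning rate at a given epoch (illustrative sketch)."""
    progress = min(epoch, total_epochs) / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Training reportedly stops after 1000-1500 epochs, so only the early part
# of the 10000-epoch schedule is traversed; batch size for all CIFAR-10
# experiments was 16.
for epoch in (0, 500, 1000, 1500):
    print(epoch, f"{cosine_lr(epoch):.6f}")
```

With these settings the learning rate decays only slightly by the time training ends, since the cosine horizon is much longer than the number of epochs actually trained.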
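For context on the headline 2.56 bits/dim figure in the Research Type row: bits per dim is the model's negative log-likelihood converted from nats to bits and averaged over the image dimensions. The conversion below is the standard definition, not code from the paper; the helper name `bits_per_dim` is illustrative.

```python
import math

# CIFAR-10 images have 32 x 32 x 3 = 3072 dimensions.
DIMS = 32 * 32 * 3

def bits_per_dim(nll_nats: float, num_dims: int = DIMS) -> float:
    """Convert a per-image negative log-likelihood in nats to bits per dimension."""
    return nll_nats / (num_dims * math.log(2))

# 2.56 bits/dim corresponds to roughly 2.56 * 3072 * ln 2 ~ 5451 nats per image,
# versus roughly 5962 nats per image at the quoted 2.80 bits/dim baseline.
print(bits_per_dim(2.56 * DIMS * math.log(2)))  # -> 2.56
```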