Wasserstein Auto-Encoders

Authors: Ilya Tolstikhin, Olivier Bousquet, Sylvain Gelly, Bernhard Schoelkopf

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that WAE shares many of the properties of VAEs (stable training, encoder-decoder architecture, nice latent manifold structure) while generating samples of better quality, as measured by the FID score.
Researcher Affiliation | Collaboration | Ilya Tolstikhin, MPI for Intelligent Systems, Tübingen, Germany, ilya@tue.mpg.de; Olivier Bousquet, Google Brain, Zürich, Switzerland, obousquet@google.com; Sylvain Gelly, Google Brain, Zürich, Switzerland, sylvaingelly@google.com; Bernhard Schölkopf, MPI for Intelligent Systems, Tübingen, Germany, bs@tue.mpg.de
Pseudocode | Yes | ALGORITHM 1 Wasserstein Auto-Encoder with GAN-based penalty (WAE-GAN). ALGORITHM 2 Wasserstein Auto-Encoder with MMD-based penalty (WAE-MMD). (A hedged sketch of the MMD-based objective appears after the table.)
Open Source Code | Yes | The code is available at github.com/tolstikhin/wae.
Open Datasets | Yes | We trained WAE-GAN and WAE-MMD (Algorithms 1 and 2) on two real-world datasets: MNIST (LeCun et al., 1998) consisting of 70k images and CelebA (Liu et al., 2015) containing roughly 203k images.
Dataset Splits | No | The paper does not provide explicit training/validation/test splits (e.g., percentages or sample counts per split), nor does it cite predefined splits; it mentions a 'training set' and a 'test set' but no separate validation split.
Hardware Specification | No | The paper describes the model architectures (e.g., 'convolutional deep neural network architectures') and training settings (e.g., 'Adam optimizer'), but it does not specify any particular hardware components such as GPU models (e.g., 'NVIDIA A100'), CPU models, or cloud computing instances used for running the experiments.
Software Dependencies | No | The paper mentions optimizers and architectural styles, such as 'Adam (Kingma & Ba, 2014)' and the 'DCGAN ones reported by Radford et al. (2016)', but it does not list specific software dependencies with version numbers (e.g., 'PyTorch 1.9', 'Python 3.8').
Experiment Setup | Yes | In all reported experiments we used Euclidean latent spaces Z = R^{d_z} for various d_z depending on the complexity of the dataset, isotropic Gaussian prior distributions P_Z(Z) = N(Z; 0, σ_z^2 · I_d) over Z, and a squared cost function c(x, y) = ||x - y||_2^2 for data points x, y ∈ X = R^{d_x}. We used deterministic encoder-decoder pairs, Adam (Kingma & Ba, 2014) with β_1 = 0.5, β_2 = 0.999, and convolutional deep neural network architectures... We tried various values of λ and noticed that λ = 10 seems to work good across all datasets we considered. We use d_z = 8 for MNIST and d_z = 64 for CelebA... We use mini-batches of size 100 and trained the models for 100 epochs... We used λ = 10 and σ_z^2 = 1. For the encoder-decoder pair we set α = 10^-3 for Adam in the beginning and for the adversary in WAE-GAN to α = 5·10^-4. After 30 epochs we decreased both by factor of 2, and after first 50 epochs further by factor of 5.
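To make the quoted training settings concrete, here is a minimal sketch that collects them into a config object and reproduces the stated learning-rate schedule. The class name, field names, and the `learning_rate` helper are assumptions made for this summary, not the authors' code (their implementation lives at github.com/tolstikhin/wae); only the numeric values come from the quoted setup.

```python
from dataclasses import dataclass

# Hypothetical config container; values are taken from the quoted setup
# (lambda = 10, sigma_z^2 = 1, Adam betas (0.5, 0.999), batch size 100,
# 100 epochs, d_z = 8 for MNIST and d_z = 64 for CelebA).
@dataclass
class WAEConfig:
    d_z: int                      # latent dimensionality
    lam: float = 10.0             # weight of the divergence penalty
    sigma_z_sq: float = 1.0       # variance of the isotropic Gaussian prior
    batch_size: int = 100
    epochs: int = 100
    adam_betas: tuple = (0.5, 0.999)
    lr_enc_dec: float = 1e-3      # initial Adam step size for encoder/decoder
    lr_adversary: float = 5e-4    # initial step size for the WAE-GAN adversary

def learning_rate(base_lr: float, epoch: int) -> float:
    """Quoted schedule: halve after 30 epochs, divide by a further 5 after 50."""
    if epoch >= 50:
        return base_lr / 10.0
    if epoch >= 30:
        return base_lr / 2.0
    return base_lr

mnist_cfg = WAEConfig(d_z=8)
celeba_cfg = WAEConfig(d_z=64)
print(learning_rate(mnist_cfg.lr_enc_dec, epoch=60))  # 1e-4 late in training
```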
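The Pseudocode row above points to Algorithms 1 and 2 of the paper (WAE-GAN and WAE-MMD). As a rough illustration of the MMD-based variant only, the sketch below evaluates a squared-error reconstruction term plus λ times an MMD estimate between encoded codes and samples from the Gaussian prior, using an inverse multiquadratics kernel. The function names, the kernel-scale heuristic, and the NumPy setting are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

def imq_kernel(a, b, c):
    """Inverse multiquadratics kernel k(a, b) = c / (c + ||a - b||^2)."""
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return c / (c + sq_dists)

def mmd_penalty(z_enc, z_prior, sigma_z_sq=1.0):
    """Estimate of MMD between encoded codes and prior samples.

    Within-sample averages drop the diagonal (i = j) terms; the kernel
    scale c = 2 * d_z * sigma_z^2 is a heuristic assumed here.
    """
    n, d_z = z_enc.shape
    c = 2.0 * d_z * sigma_z_sq
    k_qq = imq_kernel(z_enc, z_enc, c)
    k_pp = imq_kernel(z_prior, z_prior, c)
    k_qp = imq_kernel(z_enc, z_prior, c)
    off_diag = 1.0 - np.eye(n)
    mmd = (k_qq * off_diag).sum() / (n * (n - 1))
    mmd += (k_pp * off_diag).sum() / (n * (n - 1))
    mmd -= 2.0 * k_qp.mean()
    return mmd

def wae_mmd_objective(x, x_recon, z_enc, lam=10.0, sigma_z_sq=1.0):
    """Reconstruction cost c(x, y) = ||x - y||_2^2 plus lambda * MMD penalty."""
    n, d_z = z_enc.shape
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
    z_prior = np.random.normal(0.0, np.sqrt(sigma_z_sq), size=(n, d_z))
    return recon + lam * mmd_penalty(z_enc, z_prior, sigma_z_sq)

# Toy check with random stand-ins for a mini-batch and model outputs.
x = np.random.rand(100, 784)
x_recon = np.random.rand(100, 784)
z_enc = np.random.randn(100, 8)
print(wae_mmd_objective(x, x_recon, z_enc))
```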