Wasserstein Auto-Encoders
Authors: Ilya Tolstikhin, Olivier Bousquet, Sylvain Gelly, Bernhard Schoelkopf
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that WAE shares many of the properties of VAEs (stable training, encoder-decoder architecture, nice latent manifold structure) while generating samples of better quality, as measured by the FID score. |
| Researcher Affiliation | Collaboration | Ilya Tolstikhin, MPI for Intelligent Systems, Tübingen, Germany, ilya@tue.mpg.de; Olivier Bousquet, Google Brain, Zürich, Switzerland, obousquet@google.com; Sylvain Gelly, Google Brain, Zürich, Switzerland, sylvaingelly@google.com; Bernhard Schölkopf, MPI for Intelligent Systems, Tübingen, Germany, bs@tue.mpg.de |
| Pseudocode | Yes | Algorithm 1: Wasserstein Auto-Encoder with GAN-based penalty (WAE-GAN). Algorithm 2: Wasserstein Auto-Encoder with MMD-based penalty (WAE-MMD). (A minimal sketch of the MMD penalty follows the table.) |
| Open Source Code | Yes | The code is available at github.com/tolstikhin/wae. |
| Open Datasets | Yes | We trained WAE-GAN and WAE-MMD (Algorithms 1 and 2) on two real-world datasets: MNIST (LeCun et al., 1998) consisting of 70k images and CelebA (Liu et al., 2015) containing roughly 203k images. |
| Dataset Splits | No | The paper does not provide explicit training/test/validation dataset splits (e.g., specific percentages or sample counts for each split) nor does it reference predefined splits with citations for validation. It mentions 'training set' and 'test set' but no distinct validation split. |
| Hardware Specification | No | The paper describes the model architectures (e.g., 'convolutional deep neural network architectures') and training settings (e.g., 'Adam optimizer'), but it does not specify any particular hardware components such as GPU models (e.g., 'NVIDIA A100'), CPU models, or cloud computing instances used for running the experiments. |
| Software Dependencies | No | The paper mentions optimizers and architectural styles such as 'Adam (Kingma & Ba, 2014)' and 'DCGAN ones reported by Radford et al. (2016)', but it does not list specific software dependencies with version numbers (e.g., 'PyTorch 1.9', 'Python 3.8'). |
| Experiment Setup | Yes | In all reported experiments we used Euclidean latent spaces Z = R^{d_z} for various d_z depending on the complexity of the dataset, isotropic Gaussian prior distributions P_Z(Z) = N(Z; 0, σ_z² I_d) over Z, and a squared cost function c(x, y) = ‖x − y‖₂² for data points x, y ∈ X = R^{d_x}. We used deterministic encoder-decoder pairs, Adam (Kingma & Ba, 2014) with β1 = 0.5, β2 = 0.999, and convolutional deep neural network architectures... We tried various values of λ and noticed that λ = 10 seems to work well across all datasets we considered. We use d_z = 8 for MNIST and d_z = 64 for CelebA... We use mini-batches of size 100 and trained the models for 100 epochs... We used λ = 10 and σ_z² = 1. For the encoder-decoder pair we set α = 10^{-3} for Adam in the beginning and for the adversary in WAE-GAN to α = 5·10^{-4}. After 30 epochs we decreased both by a factor of 2, and after the first 50 epochs further by a factor of 5. |
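
The Pseudocode row references Algorithms 1 and 2 of the paper. Below is a minimal sketch of the MMD penalty used in Algorithm 2 (WAE-MMD), assuming the inverse multiquadratics kernel k(x, y) = C / (C + ‖x − y‖₂²) with scale C = 2 d_z σ_z² that the paper reports using. The use of PyTorch (the released code at github.com/tolstikhin/wae is TensorFlow) and the function names `imq_kernel` and `mmd_penalty` are illustrative assumptions, not the authors' implementation.

```python
import torch

def imq_kernel(a, b, c):
    # Inverse multiquadratics kernel k(x, y) = c / (c + ||x - y||^2).
    sq_dists = torch.cdist(a, b) ** 2
    return c / (c + sq_dists)

def mmd_penalty(z_prior, z_encoded, sigma2_z=1.0):
    """Unbiased U-statistic estimate of MMD^2 between a sample from the
    prior P_Z and a batch of encoded points (the penalty in Algorithm 2)."""
    n, d_z = z_prior.shape
    c = 2.0 * d_z * sigma2_z  # kernel scale C = 2 * d_z * sigma_z^2

    k_pp = imq_kernel(z_prior, z_prior, c)
    k_qq = imq_kernel(z_encoded, z_encoded, c)
    k_pq = imq_kernel(z_prior, z_encoded, c)

    # Drop diagonal terms for the unbiased within-sample estimates.
    off_diag = 1.0 - torch.eye(n, device=z_prior.device)
    return ((k_pp * off_diag).sum() / (n * (n - 1))
            + (k_qq * off_diag).sum() / (n * (n - 1))
            - 2.0 * k_pq.mean())
```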
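
The Experiment Setup row lists the optimizer settings, regularization coefficient, and learning-rate schedule. The sketch below wires those quoted values into a WAE-MMD training loop, reusing the `mmd_penalty` helper above. It is a sketch under stated assumptions: the MLP encoder and decoder are placeholders for the DCGAN-style convolutional architectures the paper uses, and the data loader is assumed to yield batches of image tensors.

```python
import torch
import torch.nn as nn

# Values quoted in the Experiment Setup row (CelebA; MNIST uses d_z = 8).
d_z, lam, sigma2_z = 64, 10.0, 1.0
d_x = 64 * 64 * 3  # flattened image size; placeholder only

# Placeholder MLPs; the paper uses deterministic DCGAN-style conv nets.
encoder = nn.Sequential(nn.Linear(d_x, 512), nn.ReLU(), nn.Linear(512, d_z))
decoder = nn.Sequential(nn.Linear(d_z, 512), nn.ReLU(), nn.Linear(512, d_x))

params = list(encoder.parameters()) + list(decoder.parameters())
opt = torch.optim.Adam(params, lr=1e-3, betas=(0.5, 0.999))

# Quoted schedule: divide the learning rate by 2 after 30 epochs,
# then by a further factor of 5 after the first 50 epochs.
sched = torch.optim.lr_scheduler.LambdaLR(
    opt, lr_lambda=lambda epoch: 1.0 if epoch < 30 else (0.5 if epoch < 50 else 0.1))

def train_epoch(loader):
    for x in loader:                      # mini-batches of size 100
        x = x.view(x.size(0), -1)
        z = encoder(x)                    # deterministic encoder
        x_rec = decoder(z)
        z_prior = sigma2_z ** 0.5 * torch.randn_like(z)   # P_Z = N(0, sigma_z^2 I)
        recon = ((x - x_rec) ** 2).sum(dim=1).mean()      # squared-Euclidean cost c(x, y)
        loss = recon + lam * mmd_penalty(z_prior, z, sigma2_z)
        opt.zero_grad()
        loss.backward()
        opt.step()

# for epoch in range(100):   # models trained for 100 epochs in the paper
#     train_epoch(loader)
#     sched.step()
```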