Learning Autoencoders with Relational Regularization

Authors: Hongteng Xu, Dixin Luo, Ricardo Henao, Svati Shah, Lawrence Carin

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility assessment. Each item below gives the reproducibility variable, the assessed result, and the supporting LLM response.

Research Type: Experimental
We test our relational regularized autoencoder (RAE) for image-generation tasks and compare it with the following alternatives: the variational autoencoder (VAE) (Kingma & Welling, 2013), the Wasserstein autoencoder (WAE) (Tolstikhin et al., 2018), the sliced Wasserstein autoencoder (SWAE) (Kolouri et al., 2018), the Gaussian mixture VAE (GMVAE) (Dilokthanakul et al., 2016), and the VampPrior (Tomczak & Welling, 2018). ... For each dataset, we compare the proposed RAE with the baselines on i) the reconstruction loss on testing samples; ii) the Fréchet Inception Distance (FID) between 10,000 testing samples and 10,000 randomly generated samples. We list the performance of various autoencoders in Table 2.

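The FID quoted above is the Fréchet distance between Gaussians fitted to Inception features of real and generated samples. A minimal sketch of that computation from precomputed statistics is shown below; the function name, feature dimension, and random statistics are placeholders, not details from the paper.

    import numpy as np
    from scipy import linalg

    def fid_from_stats(mu_r, sigma_r, mu_g, sigma_g):
        # Frechet distance between N(mu_r, sigma_r) and N(mu_g, sigma_g),
        # the Gaussians fitted to Inception features of real (r) and
        # generated (g) samples.
        diff = mu_r - mu_g
        covmean = linalg.sqrtm(sigma_r @ sigma_g)  # matrix square root
        if np.iscomplexobj(covmean):               # drop numerical imaginary parts
            covmean = covmean.real
        return float(diff @ diff + np.trace(sigma_r + sigma_g - 2.0 * covmean))

    # Toy statistics (real FID uses 2048-dim Inception pool features).
    rng = np.random.default_rng(0)
    mu_r, mu_g = rng.normal(size=64), rng.normal(size=64)
    a, b = rng.normal(size=(64, 64)), rng.normal(size=(64, 64))
    sigma_r, sigma_g = a @ a.T / 64, b @ b.T / 64
    print(fid_from_stats(mu_r, sigma_r, mu_g, sigma_g))
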
Researcher Affiliation: Collaboration
(1) Infinia ML Inc., Durham, NC, USA; (2) Duke University, Durham, NC, USA.

Pseudocode: Yes
Algorithm 1: Learning RAE with hierarchical FGW

Open Source Code: Yes
The code is at https://github.com/HongtengXu/Relational-AutoEncoders.

Open Datasets: Yes
We test the methods on the MNIST (LeCun et al., 1998) and CelebA datasets (Liu et al., 2015). For fairness, all the autoencoders have the same DCGAN-style architecture (Radford et al., 2015) and are learned with the same hyperparameters: the learning rate is 0.001; the optimizer is Adam (Kingma & Ba, 2014) with β1 = 0.5 and β2 = 0.999; the number of epochs is 50; the batch size is 100; the weight of regularizer γ is 1; the dimension of latent code is 8 for MNIST and 64 for CelebA.

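A minimal sketch of that shared optimizer configuration, assuming PyTorch; the encoder and decoder below are placeholder modules standing in for the DCGAN-style networks, which are not reproduced here.

    import torch
    from torch import nn

    latent_dim = 8            # 8 for MNIST, 64 for CelebA
    gamma = 1.0               # weight of the relational regularizer
    batch_size, num_epochs = 100, 50

    # Placeholder modules; the paper uses DCGAN-style convolutional networks.
    encoder = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, latent_dim))
    decoder = nn.Sequential(nn.Linear(latent_dim, 28 * 28), nn.Sigmoid())

    optimizer = torch.optim.Adam(
        list(encoder.parameters()) + list(decoder.parameters()),
        lr=1e-3,
        betas=(0.5, 0.999),
    )
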
Dataset Splits: Yes
For each dataset, we use 80% of the data for training, 10% for validation, and the remaining 10% for testing.

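An 80/10/10 split of this kind could be produced with torch.utils.data.random_split; the tensor dataset and seed below are illustrative stand-ins, not the paper's procedure.

    import torch
    from torch.utils.data import TensorDataset, random_split

    # Stand-in for the full MNIST or CelebA dataset object.
    dataset = TensorDataset(torch.randn(1000, 28 * 28))

    n = len(dataset)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    n_test = n - n_train - n_val      # remaining 10% goes to the test split
    train_set, val_set, test_set = random_split(
        dataset,
        [n_train, n_val, n_test],
        generator=torch.Generator().manual_seed(0),  # illustrative seed
    )
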
Hardware Specification: Yes
We implement all the autoencoders with PyTorch and train them on a single NVIDIA GTX 1080 Ti GPU.

Software Dependencies: No
The paper mentions PyTorch as the implementation framework but does not specify its version number or any other software dependencies with specific versions.

Experiment Setup: Yes
For fairness, all the autoencoders have the same DCGAN-style architecture (Radford et al., 2015) and are learned with the same hyperparameters: the learning rate is 0.001; the optimizer is Adam (Kingma & Ba, 2014) with β1 = 0.5 and β2 = 0.999; the number of epochs is 50; the batch size is 100; the weight of regularizer γ is 1; the dimension of latent code is 8 for MNIST and 64 for CelebA. For the autoencoders with structured priors, we set the number of the Gaussian components to be 10 and initialize their prior distributions at random. For the proposed RAE, the hyperparameter β is set to be 0.1, which empirically makes the Wasserstein term and the GW term in our FGW distance have the same magnitude. The probabilistic RAE calculates the hierarchical FGW based on the proximal gradient method with 20 iterations, and the deterministic RAE calculates the sliced FGW with 50 random projections. All the autoencoders use Euclidean distance as the distance between samples, thus the reconstruction loss is the mean-square-error (MSE).

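A minimal sketch of how such an objective combines the MSE reconstruction loss with the γ-weighted regularizer; the relational_reg callable below is a hypothetical placeholder for the sliced or hierarchical FGW term, not the paper's implementation.

    import torch
    import torch.nn.functional as F

    gamma = 1.0   # regularizer weight, as stated above

    def rae_objective(x, x_recon, z, relational_reg):
        # MSE reconstruction loss plus the gamma-weighted regularizer on the
        # latent codes z; relational_reg stands in for the FGW-based term.
        recon = F.mse_loss(x_recon, x)
        return recon + gamma * relational_reg(z)

    # Toy usage with a placeholder regularizer.
    x = torch.randn(100, 28 * 28)
    x_recon = torch.randn(100, 28 * 28)
    z = torch.randn(100, 8)
    loss = rae_objective(x, x_recon, z, relational_reg=lambda z: z.pow(2).mean())
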