Consistency Regularization for Variational Auto-Encoders
Authors: Samarth Sinha, Adji Bousso Dieng
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments (see Section 4), we apply the proposed technique to four VAE variants, the original VAE (Kingma & Welling, 2013), the importance-weighted auto-encoder (IWAE) (Burda et al., 2015), the β-VAE (Higgins et al., 2017), and the nouveau variational auto-encoder (NVAE) (Vahdat & Kautz, 2020). We found, on four different benchmark datasets, that CR-VAEs always yield better representations and generalize better than their base VAEs. |
| Researcher Affiliation | Collaboration | Samarth Sinha (Vector Institute, University of Toronto); Adji B. Dieng (Google Brain; Princeton University) |
| Pseudocode | Yes | Algorithm 1: Consistency Regularization for Variational Autoencoders (a minimal loss sketch appears after the table) |
| Open Source Code | Yes | 1Code for this work can be found at https://github.com/sinhasam/CRVAE |
| Open Datasets | Yes | We first consider MNIST. MNIST is a handwritten digit recognition dataset with 60,000 images in the training set and 10,000 images in the test set (LeCun, 1998). ... We also consider Omniglot, a handwritten alphabet recognition dataset (Lake et al., 2011). ... Finally we consider CelebA. It is a dataset of faces, consisting of 162,770 images for training, 19,867 images for validation, and 19,962 images for testing (Liu et al., 2018). |
| Dataset Splits | Yes | We form a validation set of 10,000 images randomly sampled from the training set. ... We use 16,280 randomly sampled images for training and 1,000 for validation and the remaining 2,000 samples for testing. ... It is a dataset of faces, consisting of 162,770 images for training, 19,867 images for validation, and 19,962 images for testing (Liu et al., 2018). |
| Hardware Specification | Yes | All experiments were done on a GPU cluster consisting of Nvidia P100 and RTX. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' but does not specify versions for any programming languages, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The networks are trained with the Adam optimizer with a learning rate of 10^-4 (Kingma & Ba, 2014) and trained for 100 epochs with a batch size of 64. We set the dimensionality of the latent variables to 50, therefore the maximum number of active latent units in the latent space is 50. We found λ = 0.1 to be best according to cross-validation using held-out log-likelihood, exploring the range [1e-4, 1.0] across datasets. In an ablation study we explore λ = 0. For the β-VAE we set λ = 0.1·β and study both β = 0.1 and β = 10, two regimes under which the β-VAE performs qualitatively very differently (Higgins et al., 2017). (A training-loop sketch using these settings appears after the table.) |
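
To make the Pseudocode row concrete, below is a minimal sketch of how the consistency-regularized objective in Algorithm 1 could be implemented, assuming a Gaussian encoder and Bernoulli decoder in PyTorch. The names `encoder`, `decoder`, and `augment` (the semantics-preserving transformation) are hypothetical placeholders, and the direction of the consistency KL should be checked against the paper's Algorithm 1; this is a sketch, not the authors' released code (see their repository for that).

```python
# Minimal CR-VAE objective sketch: two ELBO terms plus a lambda-weighted KL
# that encourages the posteriors of an image and its augmentation to agree.
import torch
import torch.nn.functional as F

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    # KL( N(mu_q, var_q) || N(mu_p, var_p) ), summed over latent dimensions.
    var_q, var_p = logvar_q.exp(), logvar_p.exp()
    kl = 0.5 * (logvar_p - logvar_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)
    return kl.sum(dim=-1)

def negative_elbo(x, encoder, decoder):
    # Reconstruction term plus KL to the standard-normal prior; also returns
    # the posterior parameters so the consistency term can reuse them.
    mu, logvar = encoder(x)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
    logits = decoder(z)
    recon = F.binary_cross_entropy_with_logits(
        logits, x, reduction="none").flatten(1).sum(-1)
    kl_prior = gaussian_kl(mu, logvar, torch.zeros_like(mu), torch.zeros_like(logvar))
    return recon + kl_prior, mu, logvar

def cr_vae_loss(x, encoder, decoder, augment, lam=0.1):
    # ELBO on the clean input, ELBO on an augmented view, and a consistency KL
    # pulling the augmented-view posterior toward the clean-view posterior.
    x_aug = augment(x)                              # semantics-preserving transform (placeholder)
    loss_clean, mu, logvar = negative_elbo(x, encoder, decoder)
    loss_aug, mu_aug, logvar_aug = negative_elbo(x_aug, encoder, decoder)
    consistency = gaussian_kl(mu_aug, logvar_aug, mu, logvar)
    return (loss_clean + loss_aug + lam * consistency).mean()
```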
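
And to mirror the hyperparameters quoted in the Experiment Setup row (Adam at 10^-4, 100 epochs, batch size 64, latent dimension 50, λ = 0.1), here is a possible training loop under the same assumptions. The constructors `make_encoder`/`make_decoder`, the `train_loader`, and `augment` are hypothetical placeholders not specified by the paper.

```python
# Training configuration sketch matching the reported hyperparameters.
import torch

LATENT_DIM = 50        # dimensionality of the latent variables
LEARNING_RATE = 1e-4   # Adam learning rate
BATCH_SIZE = 64
EPOCHS = 100
LAMBDA = 0.1           # consistency weight chosen by cross-validation

# Hypothetical model constructors and data pipeline.
encoder, decoder = make_encoder(LATENT_DIM), make_decoder(LATENT_DIM)
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=LEARNING_RATE)

for epoch in range(EPOCHS):
    for x in train_loader:             # batches of size BATCH_SIZE
        optimizer.zero_grad()
        loss = cr_vae_loss(x, encoder, decoder, augment, lam=LAMBDA)
        loss.backward()
        optimizer.step()
```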