Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer

Authors: David Berthelot*, Colin Raffel*, Aurko Roy, Ian Goodfellow

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this paper, we propose a regularization procedure which encourages interpolated outputs to appear more realistic by fooling a critic network which has been trained to recover the mixing coefficient from interpolated data. We then develop a simple benchmark task where we can quantitatively measure the extent to which various autoencoders can interpolate and show that our regularizer dramatically improves interpolation in this setting. We also demonstrate empirically that our regularizer produces latent codes which are more effective on downstream tasks, suggesting a possible link between interpolation abilities and learning useful representations." (A schematic sketch of this training objective appears after the table.)
Researcher Affiliation | Industry | David Berthelot (Google Brain, dberth@google.com); Colin Raffel (Google Brain, craffel@gmail.com); Aurko Roy (Google Brain, aurkor@google.com); Ian Goodfellow (Google Brain, goodfellow@google.com)
Pseudocode | No | The paper describes its algorithms in prose but does not provide pseudocode or algorithm blocks.
Open Source Code | Yes | "We also make our codebase available, which provides a unified implementation of many common autoencoders including our proposed regularizer." (Footnote: https://github.com/anonymous-iclr-2019/acai-iclr-2019)
Open Datasets | Yes | "On any dataset, our desiderata for a successful interpolation are that intermediate points look realistic and provide a semantically meaningful morphing between its endpoints. On this synthetic lines dataset, we can formalize these notions as specific evaluation metrics, which we describe in detail in appendix A.2. To summarize, we propose two metrics: Mean Distance and Smoothness. Mean Distance measures the average distance between interpolated points and real datapoints. Smoothness measures whether the angles of the interpolated lines follow a linear trajectory between the angle of the start and endpoint. Both of these metrics are simple to define due to our construction of a dataset where we exactly know the data distribution and manifold; we provide a full definition and justification in appendix A.2. A perfect alignment would achieve 0 for both scores; larger values indicate a failure to generate realistic interpolated points or produce a smooth interpolation respectively. By choosing a synthetic benchmark where we can explicitly measure the quality of an interpolation, we can confidently evaluate different autoencoders on their interpolation abilities." The response also cites Table 2 ("Single-layer classifier accuracy achieved by different autoencoders"), which compares Baseline, Denoising, VAE, AAE, VQ-VAE, and ACAI across latent sizes dz on MNIST, SVHN, and CIFAR-10; the accuracy values themselves did not survive extraction. (A rough sketch of the two metrics appears after the table.)
Dataset Splits | No | The paper mentions training and testing but does not explicitly describe a validation split or how one was used in the main text.
Hardware Specification | No | No specific hardware details (GPU models, CPU types, or memory sizes) are provided.
Software Dependencies | No | The paper mentions the Adam optimizer but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | "All parameters are initialized as zero-mean Gaussian random variables with a standard deviation of 1/√(fan_in · (1 + 0.2²)), set in accordance with the leaky ReLU slope of 0.2. Models are trained on 2²⁴ samples in batches of size 64. Parameters are optimized with Adam (Kingma & Ba, 2015) with a learning rate of 0.0001 and default values for β1, β2, and ϵ." (A runnable sketch of this setup appears after the table.)
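
To make the quoted regularizer concrete, here is a minimal PyTorch sketch of the two ACAI training objectives as the abstract describes them: a critic regresses the mixing coefficient from decoded interpolants, and the autoencoder is penalized for letting it succeed. The names `enc`, `dec`, `critic`, `lam`, and `gamma` are illustrative assumptions, not identifiers from the authors' codebase, and the latent is assumed to be a flat (batch, dz) tensor.

```python
import torch
import torch.nn.functional as F

def acai_losses(x, enc, dec, critic, lam=0.5, gamma=0.2):
    """One batch of the ACAI objectives; returns (ae_loss, critic_loss).

    Sketch under assumptions: enc/dec/critic are nn.Modules, the latent
    code is a flat (batch, dz) tensor, and the critic outputs one scalar
    per image with shape (batch, 1).
    """
    z = enc(x)
    x_hat = dec(z)
    recon = F.mse_loss(x_hat, x)

    # Interpolate each latent with a shuffled partner from the same batch.
    # As in the paper, the mixing coefficient alpha is drawn from [0, 0.5].
    perm = torch.randperm(x.size(0), device=x.device)
    alpha = 0.5 * torch.rand(x.size(0), 1, device=x.device)
    x_mix = dec(alpha * z + (1 - alpha) * z[perm])

    # Autoencoder term: reconstruction plus fooling the critic into
    # predicting alpha = 0 (i.e. "this looks like non-interpolated data").
    ae_loss = recon + lam * critic(x_mix).pow(2).mean()

    # Critic term: recover alpha from interpolants, and predict 0 on a
    # gamma-blend of inputs and reconstructions (the paper's extra term).
    crit_interp = critic(x_mix.detach())
    crit_real = critic(gamma * x + (1 - gamma) * x_hat.detach())
    critic_loss = F.mse_loss(crit_interp, alpha) + crit_real.pow(2).mean()
    return ae_loss, critic_loss
```

In practice the autoencoder and critic would be updated with separate optimizers in alternation, GAN-style, each minimizing its own loss.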
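
The lines-benchmark metrics can only be paraphrased here, since their exact definitions live in the paper's appendix A.2. The NumPy sketch below shows one plausible reading of the quoted descriptions: Mean Distance as the average pixel distance from each decoded interpolant to its nearest on-manifold line image, and Smoothness as the deviation of the recovered line angles from a linear trajectory between the endpoint angles. The array names and the normalization are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

def lines_metrics(interp_imgs, line_bank, line_angles):
    """interp_imgs: (T, D) decoded interpolants, flattened to pixels.
    line_bank: (N, D) densely sampled on-manifold line images.
    line_angles: (N,) the angle used to render each bank image.
    Returns (mean_distance, smoothness); 0 is perfect for both.
    """
    # Squared pixel distance from every interpolant to every bank image.
    d = ((interp_imgs[:, None, :] - line_bank[None, :, :]) ** 2).mean(-1)
    nearest = d.argmin(axis=1)

    # Mean Distance: how far interpolants stray from the data manifold.
    mean_distance = d[np.arange(len(interp_imgs)), nearest].mean()

    # Smoothness (rough proxy): largest deviation of the recovered angles
    # from a straight line between the endpoint angles, after unwrapping
    # to avoid spurious 2*pi jumps; normalized by the total angle change.
    angles = np.unwrap(line_angles[nearest])
    linear = np.linspace(angles[0], angles[-1], len(angles))
    span = max(abs(angles[-1] - angles[0]), 1e-8)
    smoothness = np.abs(angles - linear).max() / span
    return mean_distance, smoothness
```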
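
The quoted experiment setup is concrete enough to reproduce in a few lines. Below is a minimal PyTorch sketch of the initialization and optimizer configuration; treat it as an illustration of the quoted hyperparameters rather than the authors' implementation. The `init_weights` helper and the toy model are invented for the example.

```python
import math
import torch
from torch import nn

def init_weights(module, slope=0.2):
    """Zero-mean Gaussian init with std 1/sqrt(fan_in * (1 + slope**2)),
    matching the leaky-ReLU-aware scheme quoted above."""
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        # in_features for Linear; C_in * kh * kw for Conv2d.
        fan_in = module.weight[0].numel()
        std = 1.0 / math.sqrt(fan_in * (1.0 + slope ** 2))
        nn.init.normal_(module.weight, mean=0.0, std=std)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Toy stand-in network; the paper's encoder/decoder architectures differ.
model = nn.Sequential(nn.Linear(784, 32), nn.LeakyReLU(0.2), nn.Linear(32, 784))
model.apply(init_weights)

# Adam with lr = 1e-4 and default beta1/beta2/eps, as quoted.
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# 2**24 samples at batch size 64 -> 2**18 = 262,144 gradient steps.
num_steps = 2 ** 24 // 64
```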