On the Value of Infinite Gradients in Variational Autoencoder Models

Authors: Bin Dai, Li Wenliang, David Wipf

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Results are displayed in Figure 4(a), where as expected the reconstruction errors are nearly identical, but the learnable γ case leads to much lower MMD values, indicative of a better local solution with reduced under-regularization. We also plot the evolution of the gradient magnitudes ‖dL(θ,φ)/dz‖₂ in Figure 4(b) (other gradients are similar). When γ is learned, the gradient increases slowly; however, with fixed γ = γ , there exists a large gradient right from the start since γ is small but the reconstruction error is high. This contributes to a worse final solution per the results in Figure 4(a). (A minimal code sketch of this learnable-vs-fixed-γ setup appears after this table.)
Researcher Affiliation | Collaboration | Bin Dai (Institute for Advanced Study, Tsinghua University, daib09physics@hotmail.com); Li K. Wenliang (Gatsby Computational Neuroscience Unit, University College London, kevinli@gatsby.ucl.ac.uk); David Wipf (Shanghai AI Research Lab, Amazon Web Services, davidwipf@gmail.com)
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statements about open-source code availability or links to code repositories.
Open Datasets | Yes | Additionally, in the supplementary we demonstrate that indeed, if the inlier data (in this case Fashion-MNIST samples) come from a low-dimensional manifold, outlier points (MNIST samples) can be reliably differentiated... To this effect, we first train a VAE model on CelebA data [Liu et al., 2015] and learn an appropriate small value of γ denoted γ . (A dataset-loading sketch appears after this table.)
Dataset Splits | No | The paper mentions using CelebA data for training but does not specify the exact percentages or counts for training, validation, or test splits. It does not provide sufficient detail to reproduce the data partitioning.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not list any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | No | The paper states that 'network and training details' are in the supplementary, but it does not provide these details (e.g., learning rate, batch size, optimizer) in the main text.
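
The Research Type quote contrasts a Gaussian-decoder VAE with a learnable observation variance γ against one with γ fixed to a small constant, and monitors the gradient magnitude ‖dL/dz‖₂. The sketch below is a minimal illustration of that setup, not the authors' released code: the framework (PyTorch), network sizes, dimensions, and the log-parameterization of γ are all illustrative assumptions.

import torch
import torch.nn as nn

class GaussianVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=16, learn_gamma=True, gamma_init=1.0):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, x_dim))
        # gamma is the decoder (observation) variance; log-parameterized so it stays positive.
        log_gamma = torch.log(torch.tensor(float(gamma_init)))
        self.log_gamma = nn.Parameter(log_gamma, requires_grad=learn_gamma)

    def loss(self, x):
        mu, log_var = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterized latent sample
        z.retain_grad()                                         # keep dL/dz for monitoring
        x_hat = self.dec(z)
        # Gaussian reconstruction term: ||x - x_hat||^2 / gamma + dim(x) * log(gamma)
        recon = ((x - x_hat) ** 2).sum(-1) / self.log_gamma.exp() + x.shape[-1] * self.log_gamma
        kl = 0.5 * (mu ** 2 + log_var.exp() - log_var - 1.0).sum(-1)
        return (recon + kl).mean(), z

# Fixed small gamma: the 1/gamma factor inflates dL/dz from the first step,
# mirroring the "large gradient right from the start" behavior the quote describes.
model = GaussianVAE(learn_gamma=False, gamma_init=1e-3)
x = torch.rand(32, 784)
loss, z = model.loss(x)
loss.backward()
print("loss:", float(loss), " mean ||dL/dz||_2:", float(z.grad.norm(dim=-1).mean()))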
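
The datasets named in the Open Datasets quote (CelebA, Fashion-MNIST, MNIST) are all publicly downloadable. A minimal loading sketch follows, assuming torchvision; the root path and transform are illustrative choices, not details taken from the paper.

from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
# CelebA is used for training the VAE; Fashion-MNIST (inliers) and MNIST (outliers)
# are used in the quoted outlier-detection experiment.
celeba = datasets.CelebA("./data", split="train", download=True, transform=to_tensor)
fashion = datasets.FashionMNIST("./data", train=True, download=True, transform=to_tensor)
mnist = datasets.MNIST("./data", train=False, download=True, transform=to_tensor)
print(len(celeba), len(fashion), len(mnist))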