Latent Space Oddity: on the Curvature of Deep Generative Models

Authors: Georgios Arvanitidis, Lars Kai Hansen, Søren Hauberg

ICLR 2018

Reproducibility Variable | Result | LLM Response

Research Type | Experimental
"We demonstrate the usefulness of the geometric view of the latent space with several experiments. Model and implementation details can be found in Appendix D. In all experiments we first train a VAE and then use the induced Riemannian metric."

Researcher Affiliation | Academia
"Georgios Arvanitidis, Lars Kai Hansen, Søren Hauberg
Technical University of Denmark, Section for Cognitive Systems
{gear,lkai,sohau}@dtu.dk"

Pseudocode | Yes
"Algorithm 1 The training of a VAE that ensures geometry ... Algorithm 2 Brownian motion on a Riemannian manifold"

Open Source Code | No
The paper does not provide an explicit statement about releasing code or a link to a code repository for the methodology described.

Open Datasets | Yes
"We demonstrate the usefulness of the geometric view of the latent space with several experiments. ... We construct 3 sets of MNIST digits, using 1000 random samples for each digit. ... First, we train a VAE for the digits 0 and 1 from MNIST. ... We trained a VAE on the digits 0 and 1 of the MNIST scaled to [-1, 1]. We randomly split the data to 90% training and 10% test data, ensuring balanced classes."

Dataset Splits | No
The paper reports a 90% training / 10% test split but does not mention whether a validation split was used or how model selection was handled.

Hardware Specification | Yes
"We gratefully acknowledge the support of the NVIDIA Corporation with the donation of the used Titan Xp GPU."

Software Dependencies | No
The paper mentions 'bvp5c from Matlab' and 'k-means' but does not specify versions for any software components.
Experiment Setup | Yes
"We used L2 regularization with parameter equal to 1e-5."

MLP architecture (input dimension D = 784; the encoder's mean and variance functions share the weights of Layer 1):

Encoder/Decoder   Layer 1      Layer 2      Layer 3
µφ                64, (tanh)   32, (tanh)   d, (linear)
σφ                64, (tanh)   32, (tanh)   d, (softplus)
µθ                32, (tanh)   64, (tanh)   D, (sigmoid)

Convolutional architecture (for the convolutional and deconvolutional layers, the first number is the number of applied filters, the second is the kernel size, and the third is the stride; the encoder's mean and variance functions share the convolutional layers):

Encoder   Layer 1 (Conv)     Layer 2 (Conv)     Layer 3 (MLP)   Layer 4 (MLP)
µφ        32, 3, 2, (tanh)   32, 3, 2, (tanh)   1024, (tanh)    d, (linear)
σφ        32, 3, 2, (tanh)   32, 3, 2, (tanh)   1024, (tanh)    d, (softplus)
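The MLP setup quoted above can be sketched end-to-end. The following is a minimal NumPy sketch, not the authors' implementation: layer widths and activations follow the MLP table (D = 784; the latent dimension d is assumed here to be 2, and all weights are untrained placeholders). It also illustrates the paper's central construction, the Riemannian metric induced in latent space as the pull-back J^T J of the decoder's mean function (approximated with a finite-difference Jacobian; the paper's full metric adds the analogous term for the decoder variance network), plus a deliberately simplified Brownian-motion step in the spirit of Algorithm 2 that omits the drift correction.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    """Placeholder dense layer (random weights stand in for trained parameters)."""
    return rng.standard_normal((n_in, n_out)) * 0.1, np.zeros(n_out)

def affine(x, layer):
    W, b = layer
    return x @ W + b

tanh = np.tanh
softplus = lambda a: np.log1p(np.exp(a))
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

D, d = 784, 2  # D from the paper; d = 2 is an assumption for this sketch

# Encoder: Layer 1 (64, tanh) is shared between mu_phi and sigma_phi.
enc_shared = dense(D, 64)
enc_mu_2, enc_mu_3 = dense(64, 32), dense(32, d)
enc_sg_2, enc_sg_3 = dense(64, 32), dense(32, d)

def encode(x):
    h = tanh(affine(x, enc_shared))
    mu = affine(tanh(affine(h, enc_mu_2)), enc_mu_3)               # linear output
    sigma = softplus(affine(tanh(affine(h, enc_sg_2)), enc_sg_3))  # softplus output
    return mu, sigma

# Decoder mean mu_theta: 32 (tanh) -> 64 (tanh) -> D (sigmoid).
dec_1, dec_2, dec_3 = dense(d, 32), dense(32, 64), dense(64, D)

def decode_mu(z):
    h = tanh(affine(z, dec_1))
    h = tanh(affine(h, dec_2))
    return sigmoid(affine(h, dec_3))

def jacobian_fd(f, z, eps=1e-5):
    """Finite-difference Jacobian of f: R^d -> R^D at z."""
    f0 = f(z)
    J = np.empty((f0.size, z.size))
    for i in range(z.size):
        dz = np.zeros_like(z)
        dz[i] = eps
        J[:, i] = (f(z + dz) - f0) / eps
    return J

def pullback_metric(z):
    """Mean-decoder term J^T J of the induced metric (the paper's full metric
    also adds the variance network's Jacobian term)."""
    J = jacobian_fd(decode_mu, z)
    return J.T @ J  # d x d, symmetric positive semi-definite

x = rng.random(D)
mu, sigma = encode(x)
M = pullback_metric(mu)

# Simplified Riemannian Brownian-motion step: sample from N(0, dt * M^{-1}),
# omitting the drift correction used in the paper's Algorithm 2.
dt = 1e-2
L = np.linalg.cholesky(np.linalg.inv(M + 1e-8 * np.eye(d)))
z_next = mu + np.sqrt(dt) * (L @ rng.standard_normal(d))
print(mu.shape, sigma.shape, M.shape, z_next.shape)
```

Because the metric is the pull-back J^T J, latent curves are measured by how much the decoded images change along them, which is what makes geodesics in this metric follow the data manifold rather than straight latent-space lines.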