Constraining Variational Inference with Geometric Jensen-Shannon Divergence

Authors: Jacob Deasy, Nikola Simidjievski, Pietro Liò

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments demonstrate that skewing our variant of JS^{Gα}, in the context of JS^{Gα}-VAEs, leads to better reconstruction and generation when compared to several baseline VAEs."
Researcher Affiliation | Academia | "Department of Computer Science and Technology, University of Cambridge"
Pseudocode | No | The paper describes its methods in text and mathematical formulas but does not include any pseudocode blocks or formally labeled algorithms.
Open Source Code | Yes | "Code is available at: https://github.com/jacobdeasy/geometric-js"
Open Datasets | Yes | "We evaluate the reconstruction loss (mean squared error) on four standard benchmark datasets: MNIST, 28x28 black and white images of handwritten digits [21]; Fashion-MNIST, 28x28 black and white images of clothing [36]; Chairs, 64x64 black and white images of 3D chairs [1]; dSprites, 64x64 black and white images of 2D shapes procedurally generated from 6 ground truth independent latent factors [25]."
Dataset Splits | No | The paper states that it follows "standard experimental protocols" and evaluates on train and test sets, but it does not provide the specific percentages or counts for training, validation, and test splits that would be needed to reproduce the data partitioning.
Hardware Specification | No | The paper states that "Experiments were performed using PyTorch [33]" but does not specify any hardware components such as CPU or GPU models, or memory.
Software Dependencies | No | The paper mentions "Experiments were performed using PyTorch [33]" but does not provide a version number for PyTorch or any other software dependency.
Experiment Setup | Yes | "For MNIST and Fashion-MNIST, we use a 2-layer convolutional encoder/decoder with 32 and 64 4x4 filters with stride 2 and ReLU activations, followed by a fully connected layer (256 units) and the latent layer (10 dimensions). For dSprites and Chairs, we use a 3-layer convolutional encoder/decoder with 32, 32, and 64 4x4 filters with stride 2 and ReLU activations, followed by a fully connected layer (256 units) and the latent layer (10 dimensions). We use Adam optimisation [16] with a learning rate of 0.0001, and batch size 64 for 500 epochs."
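
For concreteness, the MNIST/Fashion-MNIST architecture quoted in the Experiment Setup row translates fairly directly into PyTorch. The sketch below is a rough reconstruction under stated assumptions, not the authors' implementation (their repository is linked above): a padding of 1 is assumed so that 28x28 inputs halve cleanly through the stride-2 convolutions, and the sigmoid output and reparameterisation details are illustrative.

```python
import torch
import torch.nn as nn

class ConvVAE(nn.Module):
    """Sketch of the MNIST/Fashion-MNIST VAE from the Experiment Setup row:
    2 conv layers (32, 64 filters, 4x4, stride 2, ReLU), a 256-unit fully
    connected layer, and a 10-dimensional latent layer. Padding of 1 is an
    assumption (not stated in the paper) so 28 -> 14 -> 7 spatially."""

    def __init__(self, latent_dim=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),   # 28 -> 14
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 14 -> 7
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 256), nn.ReLU(),
        )
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_logvar = nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 64 * 7 * 7), nn.ReLU(),
            nn.Unflatten(1, (64, 7, 7)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),  # 7 -> 14
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),              # 14 -> 28
            nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterisation trick: z = mu + sigma * eps.
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.decoder(z), mu, logvar

model = ConvVAE()
# Optimiser settings as quoted: Adam, learning rate 0.0001
# (batch size 64 for 500 epochs in the paper's training loop).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```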
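
For context, the JS^{Gα} objective quoted in the Research Type row has a closed form between diagonal Gaussians, since a weighted geometric mean of Gaussians is itself Gaussian (precisions mix linearly). The sketch below is a minimal illustration, not the authors' implementation: the function name is hypothetical, and the weighting convention for the geometric mean is an assumption here, chosen so that α = 0 recovers KL(p||q) and α = 1 recovers KL(q||p), matching the interpolation between forward and reverse KL that the paper emphasises for its dual variant.

```python
import torch

def skew_geometric_js(mu1, logvar1, mu2, logvar2, alpha=0.5):
    # Hypothetical helper (not from the authors' repo): closed-form
    # skew-geometric JS divergence between diagonal Gaussians.
    var1, var2 = logvar1.exp(), logvar2.exp()

    # Geometric-mean Gaussian N_alpha: precisions interpolate linearly.
    # Weights are an assumption, picked so alpha=0 gives KL(N1||N2)
    # and alpha=1 gives KL(N2||N1).
    var_a = 1.0 / (alpha / var1 + (1.0 - alpha) / var2)
    mu_a = var_a * (alpha * mu1 / var1 + (1.0 - alpha) * mu2 / var2)

    def kl(mu_p, var_p, mu_q, var_q):
        # Per-dimension KL(N(mu_p, var_p) || N(mu_q, var_q)).
        return 0.5 * (var_q.log() - var_p.log()
                      + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

    d = (1.0 - alpha) * kl(mu1, var1, mu_a, var_a) \
        + alpha * kl(mu2, var2, mu_a, var_a)
    return d.sum(dim=-1)  # sum over latent dimensions
```

In a JS^{Gα}-VAE this term stands in for the standard KL regulariser between the approximate posterior and the prior, and skewing α trades off the two KL directions, which is the skewing the Research Type row refers to.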