Meta-Amortized Variational Inference and Learning

Authors: Mike Wu, Kristy Choi, Noah Goodman, Stefano Ermon (pp. 6404-6412)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically demonstrate the effectiveness of our method by introducing the MetaVAE, and show that it significantly outperforms baselines on downstream image classification tasks on MNIST (10-50%) and NORB (10-35%).
Researcher Affiliation | Academia | Department of Computer Science and Psychology, Stanford University, {wumike, kechoi, ngoodman, ermon}@stanford.edu
Pseudocode | No | The paper describes algorithmic steps and procedures but does not include a formal pseudocode block or an algorithm box.
Open Source Code | Yes | We provide reference implementations in PyTorch, and the codebase for this work is open-sourced at https://github.com/mhw32/meta-inference-public.
Open Datasets | Yes | We study MNIST and NORB (LeCun et al. 2004).
Dataset Splits | No | The paper describes how transformations are split and how learned feature sets are split for downstream tasks, but it does not provide specific percentages or absolute counts for the main training, validation, and test splits of the datasets (MNIST, NORB) used for training the MetaVAE.
Hardware Specification | No | The paper does not provide specific details on the hardware used, such as GPU or CPU models, memory, or cloud computing instances.
Software Dependencies | No | We provide reference implementations in PyTorch (PyTorch is mentioned, but no version number is given, and no other specific software dependencies with versions are listed).
Experiment Setup | Yes | Implementation Details: In practice, for some dataset Di and input x, we implement the meta-inference model gφ(Di, x) = rφ2(CONCAT(x, hφ1(Di))), where φ = {φ1, φ2}. The summary network hφ1(·) is a two-layer perceptron (MLP) that ingests each element in Di independently and computes a summary representation using the mean. The aggregation network rφ2(·) is a second two-layer MLP that takes as input the concatenated summary and input. The corresponding i-th generative model pθi(x|z) is parameterized by an MLP with an architecture identical to rφ2(·). ReLU nonlinearities were used between layers. For more complex image domains (such as NORB), we use three-layer convolutional networks instead of MLPs. (An illustrative PyTorch sketch of this architecture is given below the table.)
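
To make the quoted implementation details concrete, the following is a minimal PyTorch sketch of the meta-inference model gφ(Di, x) = rφ2(CONCAT(x, hφ1(Di))). The two-layer MLP structure, mean aggregation over the elements of Di, and ReLU nonlinearities follow the paper's description; the hidden sizes, latent dimension, Gaussian-posterior output head, and all names in the snippet are illustrative assumptions. The open-sourced codebase at https://github.com/mhw32/meta-inference-public should be treated as the authoritative reference.

```python
import torch
import torch.nn as nn


class MetaInferenceModel(nn.Module):
    """Sketch of g_phi(D_i, x) = r_phi2(CONCAT(x, h_phi1(D_i))).

    Hidden sizes, latent dimension, and the Gaussian output head are
    assumptions for illustration, not values taken from the paper.
    """

    def __init__(self, x_dim=784, hidden_dim=400, summary_dim=64, z_dim=20):
        super().__init__()
        # Summary network h_phi1: two-layer MLP applied to each element of D_i.
        self.summary_net = nn.Sequential(
            nn.Linear(x_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, summary_dim),
        )
        # Aggregation network r_phi2: two-layer MLP over CONCAT(x, summary).
        # Producing the mean and log-variance of q(z | x, D_i) is an assumption.
        self.aggregation_net = nn.Sequential(
            nn.Linear(x_dim + summary_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 2 * z_dim),
        )

    def forward(self, x, dataset):
        # x: (batch, x_dim) query inputs; dataset: (n, x_dim) elements of D_i.
        summaries = self.summary_net(dataset)            # embed each element independently
        summary = summaries.mean(dim=0, keepdim=True)    # aggregate with the mean
        summary = summary.expand(x.size(0), -1)          # broadcast to the query batch
        stats = self.aggregation_net(torch.cat([x, summary], dim=-1))
        mu, logvar = stats.chunk(2, dim=-1)              # Gaussian posterior parameters
        return mu, logvar


# Example usage with random data of MNIST-like dimensionality.
model = MetaInferenceModel()
x = torch.randn(32, 784)          # a batch of query inputs
dataset = torch.randn(200, 784)   # a sample of elements from dataset D_i
mu, logvar = model(x, dataset)    # each of shape (32, 20)
```

For NORB-like image domains, the paper states that three-layer convolutional networks replace the MLPs; the sketch above covers only the MLP variant.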