Robust, Accurate Stochastic Optimization for Variational Inference

Authors: Akash Kumar Dhaka, Alejandro Catalina, Michael R. Andersen, Måns Magnusson, Jonathan Huggins, Aki Vehtari

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show empirically that the proposed framework works well on a diverse set of models..." "We now turn to validating our robust stochastic optimization algorithm for variational inference (summarized in Algorithm 1) through experiments on both simulated and real-world data."
Researcher Affiliation | Academia | Akash Kumar Dhaka (Aalto University, akash.dhaka@aalto.fi); Alejandro Catalina (Aalto University, alejandro.catalina@aalto.fi); Michael Riis Andersen (Technical University of Denmark, miri@dtu.dk); Måns Magnusson (Uppsala University, mans.magnusson@statistik.uu.se); Jonathan H. Huggins (Boston University, huggins@bu.edu); Aki Vehtari (Aalto University, aki.vehtari@aalto.fi)
Pseudocode | Yes | "Algorithm 1: Robust Stochastic Optimization for Variational Inference"
Open Source Code | No | The paper mentions third-party tools (viabel [23], TensorFlow Probability [9], Stan [4], and arviz [29]) but does not state that the authors' own implementation is open source, nor does it provide a link to it.
Open Datasets | Yes | Logistic regression [61] on three UCI datasets (Boston, Wine, and Concrete [10]); a high-dimensional hierarchical Gaussian model (Radon [34]); the 8-school hierarchical model [49]; and a Bayesian neural network model... on the MNIST dataset [31]
Dataset Splits | No | The paper uses several datasets but does not give explicit train/validation/test splits (percentages, sample counts, or references to standard splits) that would be needed for reproduction.
Hardware Specification | No | The paper does not describe the hardware (GPU/CPU models, memory, or cloud instances) used to run the experiments.
Software Dependencies | No | The paper mentions viabel, TensorFlow Probability, Stan, and arviz but does not give version numbers for any of these packages.
Experiment Setup | Yes | "In our experiments we used η = 0.01, W = 100, a = 0.5, τ = 1.2, and e = 20." "...we used J = 1 in all of our experiments; the exception is that Fig. 2 used J = 4..." "We also put ELBO at an advantage by doing some tuning of the threshold ϵ, while keeping ϵ = 0.02 when using our MCSE criterion."
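The MCSE-based stopping rule quoted in the experiment-setup row can be illustrated with a small sketch. Everything here is an illustrative assumption rather than the paper's implementation: the function names (`mcse_batch_means`, `converged`), the batch-means MCSE estimator, and the normalization by the iterate standard deviation are choices made for this example; the paper's actual criterion operates per parameter against an estimated posterior scale. Only the window size W = 100 and threshold ϵ = 0.02 come from the quoted text.

```python
import numpy as np


def mcse_batch_means(draws, n_batches=10):
    """Monte Carlo standard error of the mean via batch means.

    `draws` is a 1-D array of values from the optimization trace
    (one parameter). Batch means are a standard way to account for
    autocorrelation between successive stochastic iterates.
    """
    draws = np.asarray(draws, dtype=float)
    batch_size = len(draws) // n_batches
    trimmed = draws[: batch_size * n_batches]
    batch_means = trimmed.reshape(n_batches, batch_size).mean(axis=1)
    # Spread of the batch means, scaled down by the number of batches,
    # estimates the standard error of the overall mean.
    return batch_means.std(ddof=1) / np.sqrt(n_batches)


def converged(trace, window=100, eps=0.02):
    """Declare convergence when the MCSE over the last `window`
    iterates is small relative to their spread (W = 100 and
    eps = 0.02 follow the quoted settings; the normalization by the
    iterate standard deviation is a hypothetical simplification).
    """
    if len(trace) < window:
        return False
    recent = np.asarray(trace[-window:], dtype=float)
    scale = recent.std(ddof=1) + 1e-12  # guard against division by zero
    return mcse_batch_means(recent) / scale < eps
```

A flat (converged) trace passes the check, while a still-trending trace or a trace shorter than the window does not, which is the qualitative behavior a stopping rule of this kind should have.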