reproducibilityindex.ai

Advances in Black-Box VI: Normalizing Flows, Importance Weighting, and Optimization

Authors: Abhinav Agrawal, Daniel R. Sheldon, Justin Domke

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate components relating to optimization, flows, and Monte-Carlo methods on a benchmark of 30 models from the Stan model library.
Researcher Affiliation	Academia	(1) College of Information and Computer Sciences, University of Massachusetts Amherst; (2) Department of Computer Science, Mount Holyoke College; {aagrawal, sheldon, domke}@cs.umass.edu
Pseudocode	No	The paper describes algorithms and procedures in text but does not include any formally labeled pseudocode or algorithm blocks.
Open Source Code	No	The paper mentions using existing tools like Autograd and Stan but does not provide a link or explicit statement about releasing the source code for their own methodology or implementation.
Open Datasets	Yes	We evaluate each method using a benchmark of 30 models from the Stan Model library [35, 36].
Dataset Splits	No	The paper discusses evaluation using 10,000 fresh samples and mentions ADVI's step-size selection based on ELBO after 200 iterations, but it does not specify a distinct validation set with percentages or counts for hyperparameter tuning or model selection in a general sense separate from the final evaluation.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU/CPU models or memory amounts used for running experiments.
Software Dependencies	No	The paper mentions software like "Autograd, a Python automatic differentiation library [22]" and "Stan, a state-of-the-art probabilistic programming framework [4]" but does not specify their version numbers or other required software dependencies with versions.
Experiment Setup	Yes	During optimization, all methods have the same computational budget, measured as 100 "oracle evaluations" of the log p per iteration, and are optimized for 30,000 iterations.
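
The per-iteration budget in the Experiment Setup row can be illustrated with a minimal sketch. It assumes a diagonal Gaussian variational family and a toy standard-normal target, neither of which is taken from the paper; the names `ORACLE_BUDGET`, `log_p`, and `elbo_estimate` are hypothetical and only show how one ELBO estimate might consume exactly 100 evaluations of log p.

```python
import numpy as np

ORACLE_BUDGET = 100  # log p evaluations per iteration, as quoted in the table


def log_p(z):
    """Hypothetical target: unnormalized log density of a standard normal."""
    return -0.5 * np.sum(z ** 2, axis=-1)


def elbo_estimate(mu, log_sigma, rng, budget=ORACLE_BUDGET):
    """Monte Carlo ELBO estimate for a diagonal Gaussian q.

    Draws `budget` reparameterized samples, so log_p is evaluated at
    exactly `budget` points. Returns (estimate, number of log_p evaluations).
    """
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal((budget, mu.shape[0]))
    z = mu + sigma * eps  # reparameterized samples z ~ q
    # log q(z) for a diagonal Gaussian, written in terms of eps
    log_q = np.sum(-0.5 * eps ** 2 - log_sigma - 0.5 * np.log(2 * np.pi), axis=-1)
    return np.mean(log_p(z) - log_q), budget


rng = np.random.default_rng(0)
mu, log_sigma = np.zeros(2), np.zeros(2)  # assumed 2-D example, q = N(0, I)
elbo, n_calls = elbo_estimate(mu, log_sigma, rng)
```

In this toy case q matches the target exactly, so the estimate equals the log normalizer of a 2-D standard normal, log(2π), with zero variance. A full run under the quoted setup would repeat such an estimate, plus a gradient step, for 30,000 iterations.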