Using Large Ensembles of Control Variates for Variational Inference

Authors: Tomas Geffner, Justin Domke

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Results show that combining a large number of control variates this way significantly improves the convergence of inference over using the typical gradient estimators or a reduced number of control variates.
Researcher Affiliation | Academia | Tomas Geffner, College of Information and Computer Science, University of Massachusetts, Amherst, MA 01003, tgeffner@cs.umass.edu; Justin Domke, College of Information and Computer Science, University of Massachusetts, Amherst, MA 01003, domke@cs.umass.edu
Pseudocode | No | The paper describes algorithmic steps in prose (e.g., the update rule for exponential averages), but does not present a formal pseudocode block or algorithm listing.
Open Source Code | No | The paper does not provide any statements about releasing open-source code or links to code repositories for the described methodology.
Open Datasets | Yes | We tried several control variates and the combination algorithm on a Bayesian binary logistic regression model with a standard Gaussian prior, using three well known datasets: ionosphere, australian, and sonar.
Dataset Splits | No | The paper mentions using "minibatches of size 10" but does not specify explicit training, validation, or test dataset splits in terms of percentages, sample counts, or predefined split references.
Hardware Specification | No | The paper does not specify any hardware used for running the experiments (e.g., GPU/CPU models, cloud instances, or detailed computer specifications).
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, or solvers).
Experiment Setup | Yes | We use simple SGD with momentum (β = 0.9) as our optimization algorithm, minibatches of size 10, a decay factor of γ = 0.02 for the exponentially decayed empirical averages, and v0 = 10 3, value based on results obtained for the sensitivity analysis carried out (see Sec. 5.1).
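
The reported setup (momentum β = 0.9, decay factor γ = 0.02) can be illustrated with a minimal sketch. This is not the paper's code: the exact form of both update rules, the learning rate, the toy quadratic objective, and the names ema_update, momentum_step, and run_toy are all assumptions made for illustration. The block shows one common form of an exponentially decayed average and of a heavy-ball momentum step; it does not implement variational inference or control variates.

```python
import numpy as np

BETA = 0.9    # momentum coefficient, as reported in the Experiment Setup row
GAMMA = 0.02  # decay factor for the running averages, as reported


def ema_update(avg, value, gamma=GAMMA):
    """Exponentially decayed average (assumed form, not taken from the paper)."""
    return (1.0 - gamma) * avg + gamma * value


def momentum_step(theta, velocity, grad, lr, beta=BETA):
    """One SGD-with-momentum step in heavy-ball form (assumed form)."""
    velocity = beta * velocity + grad
    return theta - lr * velocity, velocity


def run_toy(steps=500, lr=0.01, seed=0):
    """Hypothetical toy problem: minimize 0.5 * ||theta||^2 from noisy gradients."""
    rng = np.random.default_rng(seed)
    theta = np.full(5, 3.0)
    velocity = np.zeros_like(theta)
    avg_sq_norm = 0.0  # running average of the squared gradient norm
    for _ in range(steps):
        # Exact gradient is theta; Gaussian noise stands in for estimator noise.
        grad = theta + 0.1 * rng.standard_normal(theta.shape)
        avg_sq_norm = ema_update(avg_sq_norm, float(grad @ grad))
        theta, velocity = momentum_step(theta, velocity, grad, lr)
    return theta, avg_sq_norm


if __name__ == "__main__":
    theta, avg_sq_norm = run_toy()
    print("final ||theta||:", np.linalg.norm(theta))
    print("running average of ||grad||^2:", avg_sq_norm)
```

With γ = 0.02, each new observation contributes 2% of the running average, so the average effectively summarizes roughly the last 1/γ = 50 iterations.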