Using Large Ensembles of Control Variates for Variational Inference
Authors: Tomas Geffner, Justin Domke
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Results show that combining a large number of control variates this way significantly improves the convergence of inference over using the typical gradient estimators or a reduced number of control variates. |
| Researcher Affiliation | Academia | Tomas Geffner, College of Information and Computer Science, University of Massachusetts, Amherst, MA 01003, tgeffner@cs.umass.edu; Justin Domke, College of Information and Computer Science, University of Massachusetts, Amherst, MA 01003, domke@cs.umass.edu |
| Pseudocode | No | The paper describes algorithmic steps in prose (e.g., the update rule for exponential averages), but does not present a formal pseudocode block or algorithm listing. |
| Open Source Code | No | The paper does not provide any statements about releasing open-source code or links to code repositories for the described methodology. |
| Open Datasets | Yes | We tried several control variates and the combination algorithm on a Bayesian binary logistic regression model with a standard Gaussian prior, using three well known datasets: ionosphere, australian, and sonar. |
| Dataset Splits | No | The paper mentions using "minibatches of size 10" but does not specify explicit training, validation, or test dataset splits in terms of percentages, sample counts, or predefined split references. |
| Hardware Specification | No | The paper does not specify any hardware used for running the experiments (e.g., GPU/CPU models, cloud instances, or detailed computer specifications). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, or solvers). |
| Experiment Setup | Yes | We use simple SGD with momentum (β = 0.9) as our optimization algorithm, minibatches of size 10, a decay factor of γ = 0.02 for the exponentially decayed empirical averages, and v0 = 10 3, value based on results obtained for the sensitivity analysis carried out (see Sec. 5.1). |
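
The reported setup names only hyperparameter values, so the following is a minimal sketch of how those pieces might fit together, not the authors' implementation. It assumes the common form of an exponentially decayed average, `(1 - gamma) * avg + gamma * sample`, and the heavy-ball form of SGD with momentum. The function names and the learning rate `lr` are hypothetical; only β = 0.9, γ = 0.02, and the minibatch size of 10 come from the quoted text.

```python
import numpy as np


def ema_update(avg, sample, gamma=0.02):
    """Exponentially decayed average (assumed form): new samples get weight gamma."""
    return (1.0 - gamma) * avg + gamma * sample


def momentum_step(params, velocity, grad, lr=0.01, beta=0.9):
    """One SGD-with-momentum step in heavy-ball form.

    lr is a hypothetical value; the summary above does not report a step size.
    """
    velocity = beta * velocity + grad
    params = params - lr * velocity
    return params, velocity


def minibatches(n, batch_size=10, rng=None):
    """Yield index arrays covering a shuffled dataset of n points."""
    rng = np.random.default_rng() if rng is None else rng
    perm = rng.permutation(n)
    for start in range(0, n, batch_size):
        yield perm[start:start + batch_size]
```

In a training loop, each minibatch would produce a gradient estimate, `ema_update` would track the running statistics used to weight the control variates, and `momentum_step` would apply the resulting gradient to the variational parameters.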