On the Convergence of Black-Box Variational Inference
Authors: Kyurae Kim, Jisu Oh, Kaiwen Wu, Yi-An Ma, Jacob R. Gardner
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate this theoretical insight by comparing proximal SGD against other standard implementations of BBVI on large-scale Bayesian inference problems. In Section 5, we evaluate the utility of proximal SGD on large-scale Bayesian inference problems. |
| Researcher Affiliation | Academia | Kyurae Kim, University of Pennsylvania (kyrkim@seas.upenn.edu); Jisu Oh, North Carolina State University (joh26@ncsu.edu); Kaiwen Wu, University of Pennsylvania (kaiwenwu@seas.upenn.edu); Yi-An Ma, University of California, San Diego (yianma@ucsd.edu); Jacob R. Gardner, University of Pennsylvania (jacobrg@seas.upenn.edu) |
| Pseudocode | Yes | Algorithm 1: Prox Gen-Adam for Black-Box Variational Inference (a hedged sketch of the underlying proximal update appears after the table) |
| Open Source Code | No | The paper does not provide an explicit statement or link to its own open-source code for the methodology described. |
| Open Datasets | Yes | LME-election: linear mixed effects model of the 1988 U.S. presidential election (Gelman & Hill, 2007); KEGG-undirected (Shannon et al., 2003); million songs (Bertin-Mahieux et al., 2011); "The dataset was obtained from PosteriorDB (Magnusson et al., 2022)." |
| Dataset Splits | No | The paper mentions batch sizes and Monte Carlo samples, but does not provide specific training/validation/test dataset splits (e.g., percentages or sample counts for each split). |
| Hardware Specification | Yes | Table 1 (Computational Resources): System topology: 2 nodes with 2 sockets each, 24 logical threads per socket (48 threads total); Processor: 1× Intel Xeon Silver 4310, 2.1 GHz (maximum 3.3 GHz) per socket; Cache: 1.1 MiB L1, 30 MiB L2, and 36 MiB L3; Memory: 250 GiB RAM; Accelerator: 1× NVIDIA RTX A5000 per node, 2 GHz, 24 GB RAM |
| Software Dependencies | No | The paper mentions 'Turing (Ge et al., 2018)' and 'Adam (Kingma & Ba, 2015)' but does not provide specific version numbers for these or other software dependencies used in the experiments. |
| Experiment Setup | Yes | We run all algorithms with a fixed stepsize... We implement doubly stochastic subsampling (Titsias & Lázaro-Gredilla, 2014) with a batch size of B = 100 (B = 500 for BT-tennis) with M = 10 Monte Carlo samples. ... The results shown used a base stepsize of γ = 10^-3, while the initial point was m0 = 0, C0 = I. (See the subsampling sketch after the table.) |
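The paper's Algorithm 1 (Prox Gen-Adam) takes stochastic gradient steps on the energy part of the ELBO and handles the entropy term through its proximal operator, with Adam-style preconditioning on top. Below is a minimal sketch of the underlying proximal SGD step for a Gaussian location-scale family q = N(m, CCᵀ) with lower-triangular C, where the entropy log|det C| depends only on diag(C) and its proximal operator has the closed form prox_{γ(−log)}(x) = (x + √(x² + 4γ))/2. The function names and the `grad_log_joint` callback are illustrative assumptions, not the authors' implementation (which builds on Turing in Julia):

```python
import numpy as np

def prox_neg_log(x, gamma):
    # Closed-form prox of gamma * (-log), applied elementwise:
    #   argmin_u  gamma * (-log u) + 0.5 * (u - x)**2
    #           = (x + sqrt(x**2 + 4 * gamma)) / 2
    # This keeps the diagonal of the scale matrix strictly positive.
    return 0.5 * (x + np.sqrt(x ** 2 + 4.0 * gamma))

def proximal_sgd_step(m, C, grad_log_joint, gamma, rng, n_mc=10):
    """One proximal SGD step for BBVI with q = N(m, C C^T), C lower triangular.

    Only the energy term E_q[-log p(z, data)] is differentiated, via the
    reparameterization z = m + C u; the entropy term log|det C| is handled
    exactly through its proximal operator on diag(C).
    """
    d = m.shape[0]
    g_m = np.zeros(d)
    g_C = np.zeros((d, d))
    for _ in range(n_mc):
        u = rng.standard_normal(d)
        z = m + C @ u
        g = -grad_log_joint(z)           # energy gradient at the sample z
        g_m += g / n_mc
        g_C += np.outer(g, u) / n_mc     # chain rule through z = m + C u
    m_new = m - gamma * g_m              # plain gradient step on the energy
    C_new = np.tril(C - gamma * g_C)     # step, then keep C lower triangular
    np.fill_diagonal(C_new, prox_neg_log(np.diag(C_new), gamma))
    return m_new, C_new
```

Prox Gen-Adam replaces the plain steps `m - gamma * g_m` and `C - gamma * g_C` with Adam-style preconditioned steps before the same proximal correction.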
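The experiment-setup row quotes doubly stochastic subsampling (Titsias & Lázaro-Gredilla, 2014): the gradient estimator subsamples both the data (a minibatch of size B, rescaled by N/B to stay unbiased) and the variational distribution (M Monte Carlo samples). A hedged sketch under the same assumptions as above; `grad_log_prior` and `grad_log_lik` are hypothetical callbacks standing in for the model's log-density gradients:

```python
import numpy as np

def doubly_stochastic_energy_grad(m, C, data, grad_log_prior, grad_log_lik,
                                  rng, batch_size=100, n_mc=10):
    """Unbiased reparameterized gradient of the energy term, subsampling
    both the data (batch of size B) and q (M Monte Carlo samples)."""
    n, d = len(data), m.shape[0]
    idx = rng.choice(n, size=batch_size, replace=False)
    scale = n / batch_size               # rescales the minibatch sum
    g_m = np.zeros(d)
    g_C = np.zeros((d, d))
    for _ in range(n_mc):
        u = rng.standard_normal(d)
        z = m + C @ u
        g = grad_log_prior(z) + scale * sum(grad_log_lik(z, data[i]) for i in idx)
        g_m += -g / n_mc                 # energy = negative joint log density
        g_C += np.outer(-g, u) / n_mc
    return g_m, g_C
```

Feeding this estimator into the proximal step above with B = 100 (B = 500 for BT-tennis), M = 10, m0 = 0, and C0 = I mirrors the setup quoted in the table row.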