Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference
Authors: Kyurae Kim, Kaiwen Wu, Jisu Oh, Jacob R. Gardner
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4. Simulations. We now evaluate our bounds and the insights gathered during the analysis through simulations. We implemented a bare-bones implementation of BBVI in Julia (Bezanson et al., 2017) with plain SGD. |
| Researcher Affiliation | Academia | 1Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, United States 2Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States. |
| Pseudocode | No | The paper provides theoretical proofs, lemmas, and theorems, but it does not include any section or figure explicitly labeled as "Pseudocode" or "Algorithm", nor does it present any structured code-like blocks. |
| Open Source Code | No | The paper mentions implementing BBVI in Julia for simulations ("We implemented a bare-bones implementation of BBVI in Julia (Bezanson et al., 2017) with plain SGD.") but does not provide any explicit statement or link regarding the open-sourcing of this code. |
| Open Datasets | Yes | We now evaluate the theoretical results with real datasets. Given a regression dataset (𝑿, 𝒚), we use the linear Gaussian model... The constants are 𝐿H = 3.520 × 10⁴, 𝜇KL = 2.909 × 10³. Due to poor conditioning, the bound is much looser compared to the quadratic case. We note that generalizing our bounds to utilize matrix smoothness and matrix-quadratic growth as done by Domke (2019) would tighten the bounds, but the theoretical gains would be marginal. Detailed information about the datasets and additional results for other parameterizations can be found in Appendix B.2. The paper also cites the "AIRFOIL dataset (Dua & Graff, 2017)", and Table 2 lists the datasets with their properties. |
| Dataset Splits | No | The paper describes the datasets used (e.g., AIRFOIL, FERTILITY, PENDULUM, WINE) and the number of Monte Carlo samples (𝑀= 10), but it does not provide specific details on how these datasets were split into training, validation, or test sets for the experiments. |
| Hardware Specification | No | The paper mentions running simulations ("We implemented a bare-bones implementation of BBVI in Julia...") but does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for these experiments. |
| Software Dependencies | No | The paper states that the implementation was done in "Julia (Bezanson et al., 2017)", but it does not specify the version number of Julia or any other software dependencies, libraries, or solvers with their respective version numbers. |
| Experiment Setup | Yes | The stepsizes were manually tuned so that all problems converge at similar speeds. For all problems, we use a unit Gaussian base distribution such that 𝜑(𝑢) = 𝒩(𝑢; 0, 1), resulting in a kurtosis of 𝜅 = 3, and use 𝑀 = 10 Monte Carlo samples. ... We set the constants as 𝜎 = 0.3, 𝜆 = 8.0, and 𝑁 = 100, the mode 𝒛 is randomly sampled from a Gaussian, and the dimension of the problem is 𝑑 = 20. For the bounded entropy case, we set 𝑆 = 2.0 (the true standard deviation is on the order of 1e-3). |
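
The setup quoted above (unit Gaussian base distribution 𝜑(𝑢) = 𝒩(𝑢; 0, 1), 𝑀 = 10 Monte Carlo samples, 𝑑 = 20, plain SGD) can be sketched as a minimal reparameterization-gradient BBVI loop. This is a hedged illustration, not the paper's Julia implementation: the quadratic target, the stepsize, and the iteration count below are assumptions chosen for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Settings mirroring the quoted simulation setup: M = 10 Monte Carlo
# samples, unit Gaussian base distribution, d = 20, plain SGD.
# Stepsize and iteration count are illustrative assumptions.
d, M, stepsize, n_iters = 20, 10, 1e-2, 2000
z_star = rng.normal(size=d)  # mode sampled from a Gaussian, as in the paper

def grad_neg_log_target(z):
    # Gradient of -log pi(z) for a simple quadratic (unit-covariance
    # Gaussian) target; an assumption standing in for the paper's models.
    return z - z_star

# Mean-field Gaussian variational family q(z) = N(mu, diag(exp(rho))^2).
mu, rho = np.zeros(d), np.zeros(d)

for _ in range(n_iters):
    u = rng.normal(size=(M, d))      # u ~ phi = N(0, I), kurtosis kappa = 3
    sigma = np.exp(rho)
    z = mu + sigma * u               # reparameterization: z = mu + sigma * u
    g = grad_neg_log_target(z)       # (M, d) gradients at the samples
    grad_mu = g.mean(axis=0)
    # Gradient of the negative ELBO w.r.t. rho; the "-1" is the entropy
    # term, since d/drho log sigma = 1 under the log-scale parameterization.
    grad_rho = (g * u).mean(axis=0) * sigma - 1.0
    mu -= stepsize * grad_mu
    rho -= stepsize * grad_rho
```

For this unit-covariance quadratic target the optimum is mu = z_star with sigma = 1, so after convergence mu should sit near z_star up to SGD noise of order sqrt(stepsize / M).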