Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference

Authors: Kyurae Kim, Kaiwen Wu, Jisu Oh, Jacob R. Gardner

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4. Simulations. We now evaluate our bounds and the insights gathered during the analysis through simulations. We implemented a bare-bones implementation of BBVI in Julia (Bezanson et al., 2017) with plain SGD.
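Since the paper's Julia code is not released, the following is a minimal, illustrative Python sketch of what such a bare-bones BBVI loop with plain SGD and M Monte Carlo samples could look like. The toy log-target, the mean-field Gaussian family, and all names below are assumptions made for illustration, not the authors' implementation or test problems.

```python
# Minimal, illustrative BBVI loop with plain SGD and M Monte Carlo samples.
# The toy log-target and all names are assumptions for illustration only; they are
# not the authors' Julia implementation or their test problems.
import numpy as np

rng = np.random.default_rng(0)
d, M, stepsize, iters = 20, 10, 1e-2, 2000

# Toy log-target: isotropic Gaussian centered at z_star.
z_star = rng.normal(size=d)
def log_target_grad(z):
    return -(z - z_star)  # gradient of log N(z; z_star, I)

# Mean-field Gaussian variational family q(z) = N(m, diag(exp(2 s))).
m, s = np.zeros(d), np.zeros(d)

for t in range(iters):
    grad_m, grad_s = np.zeros(d), np.zeros(d)
    for _ in range(M):
        u = rng.normal(size=d)         # unit Gaussian base distribution phi(u)
        z = m + np.exp(s) * u          # location-scale reparameterization
        g = log_target_grad(z)
        grad_m += g / M
        grad_s += (g * u * np.exp(s) + 1.0) / M  # +1 from the entropy of q
    # Plain SGD ascent on the reparameterized ELBO estimate.
    m += stepsize * grad_m
    s += stepsize * grad_s
```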
Researcher Affiliation | Academia | (1) Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, United States; (2) Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States.
Pseudocode | No | The paper provides theoretical proofs, lemmas, and theorems, but it does not include any section or figure explicitly labeled as "Pseudocode" or "Algorithm", nor does it present any structured code-like blocks.
Open Source Code | No | The paper mentions implementing BBVI in Julia for simulations ("We implemented a bare-bones implementation of BBVI in Julia (Bezanson et al., 2017) with plain SGD.") but does not provide any explicit statement or link regarding the open-sourcing of this code.
Open Datasets | Yes | We now evaluate the theoretical results with real datasets. Given a regression dataset (X, y), we use the linear Gaussian model... The constants are L_H = 3.520 × 10^4, μ_KL = 2.909 × 10^3. Due to poor conditioning, the bound is much looser compared to the quadratic case. We note that generalizing our bounds to utilize matrix smoothness and matrix-quadratic growth as done by (Domke, 2019) would tighten the bounds, but the theoretical gains would be marginal. Detailed information about the datasets and additional results for other parameterizations can be found in Appendix B.2. The paper also refers to the "AIRFOIL dataset (Dua & Graff, 2017)", and Table 2 lists the datasets with their properties.
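As a hedged reading of the quoted constants: if "conditioning" here refers to the usual ratio of the smoothness constant to the quadratic-growth constant (an assumption, since the paper's exact definitions are not reproduced in this report), the implied condition number would be

```latex
% Arithmetic only, under the assumed definition of the condition number.
\kappa_{\mathrm{cond}} \;=\; \frac{L_H}{\mu_{\mathrm{KL}}}
  \;=\; \frac{3.520 \times 10^{4}}{2.909 \times 10^{3}} \;\approx\; 12.1 .
```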
Dataset Splits | No | The paper describes the datasets used (e.g., AIRFOIL, FERTILITY, PENDULUM, WINE) and the number of Monte Carlo samples (M = 10), but it does not provide specific details on how these datasets were split into training, validation, or test sets for the experiments.
Hardware Specification | No | The paper mentions running simulations ("We implemented a bare-bones implementation of BBVI in Julia...") but does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for these experiments.
Software Dependencies | No | The paper states that the implementation was done in "Julia (Bezanson et al., 2017)", but it does not specify the version number of Julia or any other software dependencies, libraries, or solvers with their respective version numbers.
Experiment Setup | Yes | The stepsizes were manually tuned so that all problems converge at similar speeds. For all problems, we use a unit Gaussian base distribution such that φ(u) = 𝒩(u; 0, 1), resulting in a kurtosis of κ = 3, and use M = 10 Monte Carlo samples. ... We set the constants as σ = 0.3, λ = 8.0, and N = 100; the mode z is randomly sampled from a Gaussian; and the dimension of the problem is d = 20. For the bounded entropy case, we set S = 2.0 (the true standard deviation is on the order of 1e-3).
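For convenience, the quoted settings can be grouped into a single configuration object, sketched below. The variable names are hypothetical, the stepsizes are omitted because the paper reports them only as manually tuned, and κ = 3 is simply the kurtosis of the standard normal base distribution.

```python
# Hypothetical summary of the reported experiment settings; variable names are illustrative.
experiment_setup = {
    "base_distribution": "unit Gaussian, phi(u) = N(u; 0, 1)",  # kurtosis kappa = 3
    "monte_carlo_samples_M": 10,
    "dimension_d": 20,
    "sigma": 0.3,
    "lambda": 8.0,
    "N": 100,
    "mode_z": "randomly sampled from a Gaussian",
    "bounded_entropy_S": 2.0,  # true standard deviation on the order of 1e-3
    "stepsize": None,  # manually tuned per problem; values not reported
}
```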