Forward $\chi^2$ Divergence Based Variational Importance Sampling

Authors: Chengrui Li, Yule Wang, Weihan Li, Anqi Wu

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We apply VIS to various popular latent variable models, including mixture models, variational auto-encoders, and partially observable generalized linear models. Results demonstrate that our approach consistently outperforms state-of-the-art baselines, in terms of both log-likelihood and model parameter estimation."
Researcher Affiliation | Academia | "School of Computational Science & Engineering, Georgia Institute of Technology, Atlanta, GA 30305, USA"
Pseudocode | Yes | Algorithm 1 (VIS). For i = 1:N do: (1) sample {z^(k)}_{k=1}^K from q(z|x; ϕ); (2) update θ by maximizing ln p̂(x; θ, ϕ) via Eq. 6; (3) update ϕ by minimizing χ²(p(z|x; θ) ‖ q(z|x; ϕ)) via Eq. 12 or Eq. 24; end for. (See the code sketch after this table.)
Open Source Code | Yes | "Code: https://github.com/JerrySoybean/vis."
Open Datasets | Yes | "We apply the VAE model on the MNIST dataset (LeCun et al., 1998)." "We run different methods on a real neural spike train recorded from V = 27 retinal ganglion neurons while a mouse is performing a visual test for about 20 mins (Pillow & Scott, 2012)."
Dataset Splits | No | The paper specifies train and test sets but does not explicitly mention a separate validation set for any of the experiments.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types) used to run the experiments.
Software Dependencies | No | The paper mentions using "Adam (Kingma & Ba, 2014)" as an optimizer, but it does not specify any software libraries or dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x, Python 3.8).
Experiment Setup | Yes | "We use Adam (Kingma & Ba, 2014) as the optimizer and the learning rate is set at 0.002. We run 200 epochs for each method, and in each epoch, 100 batches of size 10 are used for optimization. The number of Monte Carlo samples used for sampling the latent is K = 5000." (These hyperparameters are wired up in the usage sketch after the table.)
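
To connect Algorithm 1 to code, here is a minimal PyTorch-style sketch of one VIS iteration. It is not the authors' implementation (see the linked repository for that): the names `log_joint` and `q_sampler`, and the log-sum-exp surrogate used for the forward χ² objective, are illustrative assumptions standing in for the paper's Eqs. 6, 12, and 24.

```python
import math
import torch

def vis_step(x, log_joint, q_sampler, theta_opt, phi_opt, K=5000):
    """One VIS iteration (a sketch, assuming PyTorch and a reparameterized q).

    log_joint(x, z) -> log p(x, z; theta), shape (K,)   (depends on theta)
    q_sampler(x, K) -> (z, log_q): K reparameterized samples z ~ q(z|x; phi)
                       and their log densities log_q, shape (K,)
    """
    # Steps 2-3 of Algorithm 1: estimate ln p(x) by importance sampling, ascend theta.
    # theta_opt.zero_grad() also clears any gradients spilled by the previous phi step.
    z, log_q = q_sampler(x, K)
    log_w = log_joint(x, z) - log_q                       # log importance weights
    log_px = torch.logsumexp(log_w, dim=0) - math.log(K)  # ln p_hat(x; theta, phi)
    theta_opt.zero_grad()
    (-log_px).backward()
    theta_opt.step()

    # Step 4: descend phi on a log-domain surrogate of the forward chi^2 divergence,
    # here log sum_k w_k^2 (constants dropped); smaller means q is closer to p(z|x).
    z, log_q = q_sampler(x, K)
    log_w = log_joint(x, z) - log_q
    chi2_surrogate = torch.logsumexp(2.0 * log_w, dim=0)
    phi_opt.zero_grad()                                   # clears grads spilled from the theta step
    chi2_surrogate.backward()
    phi_opt.step()
    return log_px.detach()
```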
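
And a hypothetical training loop wiring in the reported hyperparameters (Adam with learning rate 0.002, 200 epochs of 100 batches of size 10, K = 5000). `model`, `guide`, and `sample_batch` are placeholder objects, not names from the paper or repository.

```python
import torch

# Placeholders: `model` exposes log_joint(x, z) with parameters theta,
# `guide` exposes sample_with_log_prob(x, K) with parameters phi,
# `sample_batch` yields a data batch. None of these names come from the paper.
theta_opt = torch.optim.Adam(model.parameters(), lr=0.002)
phi_opt = torch.optim.Adam(guide.parameters(), lr=0.002)

for epoch in range(200):                  # 200 epochs
    for _ in range(100):                  # 100 batches per epoch
        x = sample_batch(size=10)         # batch size 10
        vis_step(x, model.log_joint, guide.sample_with_log_prob,
                 theta_opt, phi_opt, K=5000)
```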