Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling
Authors: Denis Blessing, Xiaogang Jia, Johannes Esslinger, Francisco Vargas, Gerhard Neumann
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our work introduces a benchmark that evaluates sampling methods using a standardized task suite and a broad range of performance criteria. Our findings provide insights into strengths and weaknesses of existing sampling methods, serving as a valuable reference for future developments. The code is publicly available here. ... Here, we offer an overview of the evaluation protocol. Next, we present the results obtained for synthetic target densities, followed by those for real targets. |
| Researcher Affiliation | Collaboration | ¹Autonomous Learning Robots, Karlsruhe Institute of Technology, Karlsruhe, Germany; ²University of Cambridge, Cambridge, United Kingdom; ³FZI Research Center for Information Technology, Karlsruhe, Germany. |
| Pseudocode | No | No explicit pseudocode or algorithm blocks found. |
| Open Source Code | Yes | The code is publicly available here. |
| Open Datasets | Yes | The datasets Credit and Cancer were taken from Nishihara et al. (2014). [...] The Ionosphere dataset (Sigillito et al., 1989) involves classifying radar signals [...] Similarly, the Sonar dataset (Gorman & Sejnowski, 1988) tackles the classification of sonar signals. [...] The Seeds data was collected by Crowder (1978). [...] The Brownian (d = 32) model [...] developed by Sountsov et al. (2020). [...] The log Gaussian Cox process (LGCP) (Møller et al., 1998) is a probabilistic model [...]. [...] train NICE (Dinh et al., 2014) on a down-sampled 14x14 variant of MNIST (Digits) (LeCun et al., 1998) and a 28x28 variant of Fashion MNIST (Fashion)... |
| Dataset Splits | No | The paper uses various datasets for sampling problems, but does not explicitly provide information on train/validation/test splits for these datasets within the context of their experiments. For example, for Digits and Fashion, they use a *trained model* as the target density, rather than performing a classification task with explicit splits. |
| Hardware Specification | No | D.B. acknowledges support by funding from the pilot program Core Informatics of the Helmholtz Association (HGF) and the state of Baden-Württemberg through bwHPC, as well as the HoreKa supercomputer funded by the Ministry of Science, Research and the Arts Baden-Württemberg and by the German Federal Ministry of Education and Research. |
| Software Dependencies | No | For GMMVI, we ported the TensorFlow implementation of https://github.com/OlegArenz/gmmvi to JAX and integrated it into our framework. ... We updated the mean and the diagonal covariance matrix using the Adam optimizer (Kingma & Ba, 2014)... A hedged JAX sketch of this mean/diagonal-covariance update is given after the table. |
| Experiment Setup | Yes | In this section, we provide details on hyperparameter tuning. For further information, please refer to Appendix D. ... For MFVI, we used a batch size of 2000 and performed 100k gradient steps, tuning the learning rate via grid search. ... In SIS methods, we employed 2000 particles for training. All methods except FAB used 128 annealing steps... Diffusion-based Methods. Training involved a batch size of 2000 and 40k gradient steps. SDEs were discretized with 128 steps, T = 1, and a fixed Δt (see the discretization sketch after this table). Additionally, Table 8 provides a comprehensive list of specific hyperparameters for each method and target. |
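
The MFVI setup quoted in the Software Dependencies and Experiment Setup rows (a diagonal Gaussian whose mean and diagonal covariance are updated with Adam, batch size 2000) can be illustrated with a minimal JAX/Optax sketch. This is not the authors' implementation: the target `log_prob`, the dimensionality `d`, the learning rate, and the step count are hypothetical placeholders.

```python
# Minimal sketch of mean-field VI with a diagonal Gaussian whose mean and
# (log) diagonal standard deviation are updated with Adam. Hypothetical
# target and hyperparameters; not the paper's code.
import jax
import jax.numpy as jnp
import optax

def log_prob(x):
    # Hypothetical target density: a standard Gaussian shifted by 2.
    return -0.5 * jnp.sum((x - 2.0) ** 2, axis=-1)

def elbo(params, key, num_samples=2000):          # batch size 2000, as in the paper
    mean, log_std = params["mean"], params["log_std"]
    eps = jax.random.normal(key, (num_samples, mean.shape[0]))
    x = mean + jnp.exp(log_std) * eps             # reparameterized samples
    log_q = jnp.sum(-0.5 * eps**2 - log_std - 0.5 * jnp.log(2 * jnp.pi), axis=-1)
    return jnp.mean(log_prob(x) - log_q)          # Monte Carlo ELBO estimate

d = 10                                            # hypothetical dimensionality
params = {"mean": jnp.zeros(d), "log_std": jnp.zeros(d)}
optimizer = optax.adam(1e-3)                      # learning rate would come from grid search
opt_state = optimizer.init(params)

@jax.jit
def step(params, opt_state, key):
    loss, grads = jax.value_and_grad(lambda p: -elbo(p, key))(params)
    updates, opt_state = optimizer.update(grads, opt_state, params)
    return optax.apply_updates(params, updates), opt_state, loss

key = jax.random.PRNGKey(0)
for _ in range(1000):                             # the paper reports 100k steps; shortened here
    key, subkey = jax.random.split(key)
    params, opt_state, loss = step(params, opt_state, subkey)
```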
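Similarly, the fixed-Δt SDE discretization quoted for the diffusion-based methods (128 steps, T = 1) can be sketched with a plain Euler-Maruyama loop. The drift `f` and the noise scale `sigma` are hypothetical stand-ins for a learned drift or score network; this is a sketch of the discretization scheme only, not the paper's sampler.

```python
# Minimal sketch of a fixed-step SDE discretization: 128 steps over T = 1 with
# constant dt, integrated with Euler-Maruyama. Drift and noise scale are
# hypothetical placeholders.
import jax
import jax.numpy as jnp

T = 1.0
num_steps = 128
dt = T / num_steps                                # fixed Δt

def f(x, t):
    # Hypothetical drift, e.g. an Ornstein-Uhlenbeck pull toward the origin.
    return -x

def simulate(key, x0, sigma=1.0):
    def body(carry, t):
        x, key = carry
        key, subkey = jax.random.split(key)
        noise = jax.random.normal(subkey, x.shape)
        x = x + f(x, t) * dt + sigma * jnp.sqrt(dt) * noise   # Euler-Maruyama step
        return (x, key), x
    ts = jnp.linspace(0.0, T - dt, num_steps)
    (x_final, _), path = jax.lax.scan(body, (x0, key), ts)
    return x_final, path

key = jax.random.PRNGKey(0)
key_init, key_sim = jax.random.split(key)
x0 = jax.random.normal(key_init, (2000, 10))      # 2000 particles, as quoted above
x_final, path = simulate(key_sim, x0)
```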