Efficient Mixture Learning in Black-Box Variational Inference

Authors: Alexandra Hotti, Oskar Kviman, Ricky Molén, Víctor Elvira, Jens Lagergren

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimenting with MISVAE, we achieve astonishing, SOTA results on MNIST. Furthermore, we empirically validate our estimators in other BBVI settings, including Bayesian phylogenetic inference, where we improve inference times for the SOTA mixture model on eight data sets. In this section, we infer variational parameters using MISVAE along with the S2S, S2A, and A2A estimators. We conduct comparisons among these methods and against SOTA approaches across a synthetic dataset, three image datasets, and eight phylogenetic datasets.
Researcher Affiliation | Collaboration | KTH Royal Institute of Technology, Science for Life Laboratory, Klarna, University of Edinburgh
Pseudocode | No | The paper does not contain pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | All code necessary to replicate our experiments is publicly available at: https://github.com/okviman/efficient-mixtures.
Open Datasets | Yes | We train on MNIST (LeCun & Cortes, 2010), Fashion MNIST (Xiao et al., 2017), and CIFAR-10 (Krizhevsky et al.). We performed experiments on eight popular datasets for Bayesian phylogenetics (Hedges et al., 1990; Garey et al., 1996; Yang & Yoder, 2003; Henk et al., 2003; Lakner et al., 2008; Zhang & Blackwell, 2001; Yoder & Yang, 2004; Rossman et al., 2001). See the dataset-loading sketch after the table.
Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. It names the datasets used but not how they were partitioned into these subsets for the reported experiments.
Hardware Specification | Yes | All experiments were conducted on NVIDIA RTX 4090s with 24 GiB of memory each, using the PyTorch framework (Paszke et al., 2019).
Software Dependencies | No | The paper mentions the PyTorch framework (Paszke et al., 2019) and the Adam optimizer (cited inconsistently as Kingma & Ba, 2017 and Kingma & Ba, 2014). Although PyTorch is a key software component, no explicit version (e.g., PyTorch 1.x) is given, only a citation to the paper that introduced it; Adam is an optimization algorithm rather than an installable dependency with a version number. See the version-recording sketch after the table.
Experiment Setup | Yes | For optimization, we used Adam (Kingma & Ba, 2017) with a learning rate of 0.0005 and a batch size of 100, and initiated the process with a KL-warmup phase lasting 100 epochs. We optimized using Adam with a learning rate of 0.001 and a batch size of 100, and initiated the training with KL-warmup during 500 epochs. The approximations were learned using the Adam optimizer (Kingma & Ba, 2014) with learning rate equal to 0.001. See the training-loop sketch after the table.
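
The image datasets named in the Open Datasets row are standard torchvision downloads. Below is a minimal, hedged sketch of obtaining them; the root path, transforms, and loader settings are assumptions and may differ from the pipeline in the paper's repository.

```python
# Minimal sketch of fetching the three public image datasets cited in the paper
# via torchvision. Root path, transforms, and shuffling are assumptions, not
# details taken from the paper or its code.
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()  # maps images to [0, 1] tensors

mnist = datasets.MNIST("data/", train=True, download=True, transform=to_tensor)
fashion = datasets.FashionMNIST("data/", train=True, download=True, transform=to_tensor)
cifar10 = datasets.CIFAR10("data/", train=True, download=True, transform=to_tensor)

# Batch size 100 matches the experiment setup quoted in the table above.
train_loader = DataLoader(mnist, batch_size=100, shuffle=True)
```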
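
Because the paper cites PyTorch only by reference and gives no version numbers, a reproduction would have to record the environment itself. A small hedged sketch of one way to capture the installed versions; the output file name is illustrative, not from the paper.

```python
# Record the installed framework versions so a rerun can pin the same environment.
# The file name "environment_versions.txt" is an assumption for illustration.
import sys
import torch

with open("environment_versions.txt", "w") as f:
    f.write(f"python {sys.version.split()[0]}\n")
    f.write(f"torch {torch.__version__}\n")
    f.write(f"cuda {torch.version.cuda}\n")  # None if built without CUDA support
```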
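
The Experiment Setup row quotes Adam with a learning rate of 0.0005, a batch size of 100, and a 100-epoch KL-warmup phase. The sketch below shows how such a configuration might look in PyTorch, assuming a standard VAE-style loss where the KL term is annealed linearly over the warmup epochs; the linear schedule and the model/loss interface are assumptions, not details taken from the paper or its code.

```python
# Hedged sketch of the quoted optimization settings: Adam, lr 5e-4, batch size 100,
# and a 100-epoch KL-warmup phase. The linear warmup schedule and the model's
# return values are illustrative assumptions.
import torch

def kl_weight(epoch: int, warmup_epochs: int = 100) -> float:
    """Linearly anneal the KL weight from 0 to 1 over the warmup epochs."""
    return min(1.0, (epoch + 1) / warmup_epochs)

def train(model, train_loader, device, epochs: int = 500):
    optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
    for epoch in range(epochs):
        beta = kl_weight(epoch)
        for x, _ in train_loader:
            x = x.to(device)
            # Assumed interface: the model returns reconstruction and KL terms.
            recon_loss, kl_term = model(x)
            loss = recon_loss + beta * kl_term
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```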