Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Logistic Variational Bayes Revisited
Authors: Michael Komodromos, Marina Evangelou, Sarah Lucie Filippi
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive simulations we demonstrate that VI-PER leads to more accurate posterior approximations and improves on the well known issue of variance underestimation within the variational posterior, which can be of critical importance in real world applications as demonstrated in Section 4 (Blei et al., 2017; Durante & Rigon, 2019). In this section a numerical evaluation of our method taking l = 12 is performed. Referring to our method as Variational Inference with Probabilistic Error Reduction (VI-PER), we compare against the Polya-Gamma formulation (VI PG) [which is a probabilistic interpretation of the bound introduced by Jaakkola & Jordan (2000)] and the ELBO computed via Monte-Carlo (VI MC) using 1,000 samples. |
| Researcher Affiliation | Academia | Michael Komodromos 1 Marina Evangelou 1 Sarah Filippi 1 1Department of Mathematics, Imperial College London, United Kingdom. |
| Pseudocode | No | The paper describes mathematical derivations and detailed steps of the method, but does not include a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | The implementation is freely available at https://github.com/mkomod/vi-per. |
| Open Datasets | Yes | To model soil liquefaction we use data from a study by Zhan et al. (2023), which was accessed with permission of the author and will be publicly available in the near future. Here logistic Gaussian Process classification is applied to a number of publicly available datasets, all of which are accessible through UCI or the LIBSVM package (Chang & Lin, 2011). |
| Dataset Splits | Yes | For each dataset we use the first 80% of the data for training and the remaining 20% for testing (when a testing set is not available). |
| Hardware Specification | Yes | Hardware Information (Configuration 1) CPU: AMD EPYC 7742 64-Core Processor CPU Cores: 256 Hardware Information (Configuration 2) CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz CPU Cores: 48 |
| Software Dependencies | Yes | python 3.11.5 pytorch 2.1.0 gpytorch 1.10 hamiltorch 0.4.1 torcheval 0.0.7 numpy 1.26.0 matplotlib 3.7.2 geopandas 0.14.1 pandas 2.1.3 |
| Experiment Setup | Yes | Regarding the initialization of µ and Σ, the mean vector µ is sampled from a Gaussian distribution with zero mean and identity covariance matrix, and Σ = 0.35Ip. For our sampler we use 30,000 iterations and a burn-in of 25,000 iterations. The step size is set to 0.01 and the number of leapfrog steps is set to 25. In all our applications we consider a GP model with M = 50 inducing points, linear mean function and ARD kernel with lengthscales initialized at 0.5. |
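The dataset-split protocol quoted above (first 80% for training, remaining 20% for testing, when no predefined test set exists) can be sketched as follows. This is an illustrative helper, not code from the paper's repository; the function name is hypothetical.

```python
def train_test_split_sequential(X, y, train_frac=0.8):
    """Sequentially split data: the first train_frac of rows for training,
    the remainder for testing (used only when no test set is provided)."""
    n_train = int(len(X) * train_frac)
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]
```

For a dataset with 100 rows this yields 80 training and 20 test rows, matching the 80/20 protocol described in the table.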
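The variational-parameter initialization described in the experiment setup (mean µ drawn from a standard Gaussian, covariance Σ = 0.35·I_p) can be sketched in PyTorch, which is among the listed dependencies. The function name and interface are assumptions for illustration only.

```python
import torch

def init_variational_params(p, sigma_scale=0.35):
    """Initialize variational parameters as described in the setup:
    mu ~ N(0, I_p) and Sigma = 0.35 * I_p."""
    mu = torch.randn(p)              # mean vector sampled from N(0, I)
    Sigma = sigma_scale * torch.eye(p)  # scaled identity covariance
    return mu, Sigma
```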