reproducibilityindex.ai

LR-GLM: High-Dimensional Bayesian Inference Using Low-Rank Data Approximations

Authors: Brian Trippe, Jonathan Huggins, Raj Agrawal, Tamara Broderick

ICML 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments support our theory and demonstrate the efficacy of LR-GLM on real large-scale datasets.
Researcher Affiliation	Academia	¹Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA ²Department of Biostatistics, Harvard, Cambridge, MA.
Pseudocode	Yes	Algorithm 1 LR-Laplace for Bayesian inference in GLMs with low-rank data approximations and zero-mean prior with computation costs. See Appendix H for the general algorithm.
Open Source Code	No	The paper does not contain any statement about making its source code available or provide a link to a code repository.
Open Datasets	Yes	The first is the UCI Farm-Ads dataset, which consists of N = 4,143 online advertisements for animal-related topics together with binary labels indicating whether the content provider approved of the ad; there are D = 54,877 bag-of-words features per ad (Dheeru & Karra Taniskidou, 2017). As a second real dataset we evaluated our approach on the Reuters RCV1 text categorization test collection (Amini et al., 2009; Chang & Lin, 2011). RCV1 consists of D = 47,236 bag-of-words features for N = 20,241 English documents grouped into two different categories.
Dataset Splits	No	The paper mentions datasets used but does not provide specific percentages or sample counts for training, validation, or test splits.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance specifications used for running experiments.
Software Dependencies	No	The paper mentions software tools such as 'Stan' and 'PyStan' but does not give version numbers for these or any other software dependencies, which a reproducible description requires.
Experiment Setup	Yes	For synthetic data experiments, we considered logistic regression with covariates of dimension D = 250 and D = 500. In each replicate, we generated the latent parameter from an isotropic Gaussian prior, β ∼ N(0, I_D), correlated covariates from a multivariate Gaussian, and responses from the logistic regression likelihood (see Appendix A.1 for details)... As a practical rule of thumb, we recommend setting M to be as large as is allowable for the given application without the resulting inference becoming too slow. For our experiments with LR-Laplace, this limit was M 20,000.
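
The synthetic setup quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the AR(1)-style covariance with parameter `rho`, the {0, 1} label coding, and the sample size are all assumptions, since the exact covariate distribution is deferred to Appendix A.1 of the paper. The rank-M projection onto the top right singular vectors is likewise only one common way to form a low-rank data approximation and may differ from the paper's construction.

```python
import numpy as np


def make_synthetic_logistic(n, d, rho=0.5, seed=0):
    """Draw one synthetic logistic-regression replicate (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Latent parameter from an isotropic Gaussian prior: beta ~ N(0, I_D).
    beta = rng.standard_normal(d)
    # ASSUMPTION: correlated covariates with Sigma[i, j] = rho ** |i - j|.
    idx = np.arange(d)
    sigma = rho ** np.abs(idx[:, None] - idx[None, :])
    chol = np.linalg.cholesky(sigma)
    X = rng.standard_normal((n, d)) @ chol.T
    # Responses from the logistic likelihood, coded as 0/1 (assumption).
    prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
    y = (rng.random(n) < prob).astype(int)
    return X, y, beta


def low_rank_projection(X, m):
    """Project X onto its top-m right singular vectors (hypothetical helper)."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    U = vt[:m].T          # D x m, orthonormal columns
    return X @ U, U       # N x m projected data, and the projection basis


if __name__ == "__main__":
    X, y, beta = make_synthetic_logistic(n=200, d=250)
    Z, U = low_rank_projection(X, m=20)
    print(X.shape, Z.shape, U.shape)
```

Here `m` plays the role of the rank M from the table: a larger value keeps more of the data's variation at a higher computational cost, which is the trade-off behind the quoted rule of thumb.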