Fast Laplace Approximation for Sparse Bayesian Spike and Slab Models

Authors: Syed Abbas Z. Naqvi, Shandian Zhe, Yuan Qi, Yifan Yang, Jieping Ye

IJCAI 2016

Reproducibility variables, results, and LLM-extracted supporting evidence:
Research Type: Experimental. Evidence (Section 5, "Experiments"; Section 5.1, "Simulation"): "First we examine our method in a simulation study. ... We compare our approach with alternative approximate inference algorithms ... Figures 1a and 1e show the predictive performance of all the methods for regression and classification. ... Table 1 lists the average prediction accuracy and standard errors on the original datasets."
Researcher Affiliation: Academia. Evidence: "Syed Abbas Z. Naqvi (1), Shandian Zhe (1), Yuan Qi (1), Yifan Yang (2), and Jieping Ye (3); (1) Department of Computer Science, Purdue University; (2) Department of Biology, Purdue University; (3) Department of EECS, University of Michigan, Ann Arbor."
Pseudocode: No. The paper describes its computational steps in prose but includes no explicit pseudocode or algorithm blocks.
Open Source Code: No. The paper mentions third-party software packages (Glmnet, Gist) but provides no statement about, or link to, open-source code for its own method.
Open Datasets: Yes. Evidence: "We then examine all the algorithms on 14 published large real datasets, including 8 classification datasets and 6 regression datasets: Diffuse large B-cell lymphoma (DLBCL) [Rosenwald et al., 2002], GSE5680 [Scheetz et al., 2006], YearPrediction (Year), House-census (House), 10K corpus [Kogan et al., 2009], and TIED." Footnote URLs: www.shi-zhong.com/software/docdata.zip; archive.ics.uci.edu/ml/datasets.html; www.cs.toronto.edu/~delve/data/census-house/desc.html; www.causality.inf.ethz.ch/repository.php
Dataset Splits: Yes. Evidence: "We fix the number of test samples to 200 and vary the number of training samples n from {60, 80, 100, 120}. For each n, we randomly generate 50 datasets and report the average results. ... We randomly split each dataset into two parts (10% of samples for training and the rest for testing) 10 times and run all the methods on each partition. In each run, we use 10-fold cross-validation on the training data to tune the free parameters." A sketch of this protocol appears below.
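The paper does not release code, so the following is a minimal sketch of the split-and-tune protocol it describes, assuming scikit-learn utilities. The names repeated_split_eval, fit, score, and param_grid are hypothetical placeholders, not from the paper:

import numpy as np
from sklearn.model_selection import KFold, train_test_split

def repeated_split_eval(X, y, fit, score, param_grid,
                        n_repeats=10, train_frac=0.10, seed=0):
    """Repeat a random 10%/90% train/test split n_repeats times; in each
    run, tune the free parameter with 10-fold CV on the training part."""
    test_scores = []
    for rep in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=train_frac, random_state=seed + rep)
        # 10-fold cross-validation on the training data only
        cv = KFold(n_splits=10, shuffle=True, random_state=seed)
        cv_means = []
        for param in param_grid:
            folds = [score(fit(X_tr[tr], y_tr[tr], param), X_tr[va], y_tr[va])
                     for tr, va in cv.split(X_tr)]
            cv_means.append(np.mean(folds))
        best = param_grid[int(np.argmax(cv_means))]
        model = fit(X_tr, y_tr, best)          # refit on the full training part
        test_scores.append(score(model, X_te, y_te))
    return np.mean(test_scores), np.std(test_scores)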
Hardware Specification: No. The paper does not mention the hardware (CPU/GPU models, memory) used to run the experiments.
Software Dependencies: No. The paper mentions the Glmnet and Gist software packages but gives no version numbers for these or any other dependencies.
Experiment Setup: Yes. Evidence: "For our methods, we use the solution of L2 regularization as the initialization point. The variances for the spike and slab components, i.e., r0 and r1, are chosen by cross-validation. The grids used are r0 = [10^-6, 10^-5, 10^-4, 10^-3] and r1 = [1 : 1 : 5]. ... In the step of using the Nyström approach to calculate the Laplace approximation, we sample 5 columns for each Nyström approximation and repeat 5 times for an ensemble estimate of the inverse Hessian diagonal."
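The Nyström step is stated only in prose, so the sketch below shows one plausible reading of "sample 5 columns, repeat 5 times, average the estimates of the inverse-Hessian diagonal". The ridge regularizer and the Woodbury-identity inversion are our assumptions (a pure rank-5 Nyström approximation is singular and cannot be inverted directly); the function name is hypothetical:

import numpy as np

def nystrom_inv_hessian_diag(H, n_cols=5, n_repeats=5, ridge=1.0, seed=0):
    """Ensemble estimate of diag(H^{-1}) from repeated Nystrom sketches.

    Per sketch: sample n_cols columns of the (PSD) Hessian H, form the
    regularized Nystrom approximation H ~ ridge*I + C W^{-1} C^T, and
    invert it in closed form via the Woodbury identity. Only the column
    and repeat counts come from the paper; the rest is an assumption.
    """
    d = H.shape[0]
    rng = np.random.default_rng(seed)
    est = np.zeros((n_repeats, d))
    for r in range(n_repeats):
        idx = rng.choice(d, size=n_cols, replace=False)
        C = H[:, idx]                # d x k sampled columns
        W = H[np.ix_(idx, idx)]      # k x k intersection block
        s = ridge
        # Woodbury: (sI + C W^{-1} C^T)^{-1}
        #         = (1/s) I - (1/s^2) C (W + C^T C / s)^{-1} C^T
        M = W + C.T @ C / s
        # diagonal of C M^{-1} C^T without forming the d x d matrix
        corr = np.einsum('ij,ji->i', C, np.linalg.solve(M, C.T))
        est[r] = 1.0 / s - corr / s**2
    return est.mean(axis=0)          # average the n_repeats estimates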