Sparse Bayesian Learning via Stepwise Regression
Authors: Sebastian E. Ament, Carla P. Gomes
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report numerical experiments using a variety of feature selection algorithms. Notably, RMP and its limiting variant are both efficient and maintain strong performance with correlated features. |
| Researcher Affiliation | Academia | Department of Computer Science, Cornell University, Ithaca, NY, USA. Correspondence to: Sebastian Ament <ament@cs.cornell.edu>. |
| Pseudocode | Yes | Algorithm 1: Relevance Matching Pursuit (RMPσ)... Algorithm 2: RMP0 (an illustrative sketch of the greedy-pursuit family these algorithms belong to follows the table). |
| Open Source Code | Yes | Code made available at CompressedSensing.jl. |
| Open Datasets | Yes | Figure 4 shows the mean test error as a function of sparsity on the UCI Boston housing data (Dua and Graff, 2017), which contains 506 data points. |
| Dataset Splits | Yes | The results are averaged over 4608 sparsity-error values for each algorithm, generated by evaluations for different tolerance parameters δ (i.e. σ) and random 75-25 train-test splits. |
| Hardware Specification | Yes | All experiments were run on a workstation with an Intel Xeon CPU X5670 and 47 GB of memory. |
| Software Dependencies | No | We implemented all algorithms in Julia (Bezanson et al., 2017), using the JuMP framework (Dunning et al., 2017) to model the BP-approaches as second-order cone programs, and solve them using ECOS (Domahidi et al., 2013) with default settings. While software names and their corresponding citations are provided, specific version numbers for Julia, JuMP, or ECOS are not mentioned in the text (a hedged JuMP sketch of such a model follows the table). |
| Experiment Setup | Yes | For the synthetic experiments, the weights x are random k-sparse vectors with ±1 entries and the targets y were perturbed by random vectors distributed uniformly on the 10⁻²-hypersphere. For all algorithms, we input δ = 2ε to simulate a small misspecification of the tolerance parameter that is likely to occur in practice. ... We used matrices of size 64 by 128 and ℓ2-normalized the columns (a sketch of this setup follows the table). |
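
The pseudocode row above quotes only the titles of Algorithms 1 and 2. As context for the greedy stepwise-regression family that RMPσ belongs to, here is a minimal Julia sketch of classic Orthogonal Matching Pursuit. It is not the paper's RMP, whose column-selection rule differs; the function name and stopping rule here are our assumptions.

```julia
using LinearAlgebra

# Classic Orthogonal Matching Pursuit (illustrative only; NOT the paper's RMP).
# Greedily selects columns of A until the residual norm drops below δ.
function omp(A::AbstractMatrix, y::AbstractVector, δ::Real)
    n = size(A, 2)
    support = Int[]
    x = zeros(n)
    r = copy(y)                          # current residual
    while norm(r) > δ && length(support) < n
        i = argmax(abs.(A' * r))         # column most correlated with residual
        push!(support, i)
        x_s = A[:, support] \ y          # least-squares refit on the support
        x .= 0
        x[support] .= x_s
        r .= y .- A * x
    end
    return x, support
end
```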
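The software-dependencies row states that the BP approaches were modeled as second-order cone programs in JuMP and solved with ECOS at default settings. The following is a hedged reconstruction of such a model, not the authors' released code; the function name `basis_pursuit` and its interface are our assumptions.

```julia
using JuMP, ECOS, LinearAlgebra

# Basis-pursuit denoising, min ‖x‖₁ s.t. ‖Ax − y‖₂ ≤ δ, posed as a
# second-order cone program (illustrative reconstruction, not the paper's code).
function basis_pursuit(A::AbstractMatrix, y::AbstractVector, δ::Real)
    n = size(A, 2)
    model = Model(ECOS.Optimizer)
    set_silent(model)
    @variable(model, x[1:n])
    @variable(model, t[1:n] >= 0)                  # t[i] bounds |x[i]|
    @constraint(model,  x .<= t)
    @constraint(model, -t .<= x)
    @constraint(model, [δ; A * x .- y] in SecondOrderCone())  # ‖Ax − y‖₂ ≤ δ
    @objective(model, Min, sum(t))                 # equals ‖x‖₁ at the optimum
    optimize!(model)
    return value.(x)
end
```

The elementwise bounds `-t .<= x .<= t` together with minimizing `sum(t)` force `sum(t)` to equal the ℓ1 norm of `x` at the optimum, which is the standard SOCP reformulation of the ℓ1 objective.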
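Finally, the experiment-setup row describes how the synthetic instances were generated. Below is a short sketch under the quoted parameters (64×128 matrices with ℓ2-normalized columns, k-sparse ±1 weights, noise of norm 10⁻²); the default sparsity `k = 8`, the Gaussian design matrix, and the helper name are our assumptions for illustration.

```julia
using LinearAlgebra, Random

# Illustrative synthetic instance per the quoted setup; k = 8 and the Gaussian
# design matrix are assumptions, not values stated in the quoted text.
function synthetic_instance(; m = 64, n = 128, k = 8, noise_radius = 1e-2)
    A = randn(m, n)
    for j in 1:n
        A[:, j] ./= norm(A[:, j])                 # ℓ2-normalize the columns
    end
    x = zeros(n)
    x[randperm(n)[1:k]] .= rand([-1.0, 1.0], k)   # random k-sparse ±1 weights
    e = randn(m)
    e .*= noise_radius / norm(e)                  # uniform on the 10⁻²-hypersphere
    y = A * x + e
    return A, x, y
end
```

Normalizing a Gaussian vector to a fixed radius yields a draw that is uniform on the sphere of that radius, matching the quoted perturbation model.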