Additive Approximations in High Dimensional Nonparametric Regression via the SALSA
Authors: Kirthevasan Kandasamy, Yaoliang Yu
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Via a comparison on 15 real datasets, we show that our method is competitive against 21 other alternatives. |
| Researcher Affiliation | Academia | Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, USA |
| Pseudocode | No | The paper describes the algorithm mathematically and in text but does not present a formal pseudocode block. |
| Open Source Code | Yes | Our software and datasets are available at github.com/kirthevasank/salsa. Our implementation of locally polynomial regression is also released as part of this paper and is made available at github.com/kirthevasank/local-poly-reg. |
| Open Datasets | Yes | The datasets were taken from the UCI repository, Bristol Multilevel Modeling and the following sources: (Guillame-Bert et al., 2014; Just et al., 2010; Paschou, 2007; Tegmark et al., 2006; Tu, 2012; Wehbe et al., 2014). |
| Dataset Splits | Yes | For a given d we solve (1) for different λ and pick the best one via cross validation. To choose the optimal d we cross validate on d. |
| Hardware Specification | No | The paper does not provide specific hardware details such as CPU/GPU models, memory, or cloud instance types used for experiments. |
| Software Dependencies | No | We used software from (Chang & Lin, 2011; Hara & Chellappa, 2013; Jakabsons, 2015; Lin & Zhang, 2006; Rasmussen & Williams, 2006) or from Matlab. |
| Experiment Setup | Yes | In our experiments we set each k_i to be a Gaussian kernel k_i(x_i, x_i′) = σ_Y exp(−(x_i − x_i′)² / 2h_i²) with bandwidth h_i = c σ_i n^{−1/5}. Here σ_i is the standard deviation of the i-th covariate and σ_Y is the standard deviation of Y. The choice of bandwidth was inspired by several other kernel methods which use bandwidths on the order of σ_i n^{−1/5} (Ravikumar et al., 2009; Tsybakov, 2008). The constant c was hand tuned; we found that performance was robust to choices between 5 and 60. In our experiments we use c = 20. |
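As a minimal sketch of the kernel setup quoted above, the snippet below builds the Gram matrix for a single covariate using a Gaussian kernel with the stated bandwidth rule h_i = c σ_i n^{−1/5} and c = 20. The function name and NumPy implementation are our own illustration, not code from the authors' released package.

```python
import numpy as np

def gaussian_kernel_matrix(x, sigma_y, c=20.0):
    """Gram matrix for one covariate with the paper's bandwidth rule.

    x       : (n,) array of values of a single covariate x_i
    sigma_y : standard deviation of the response Y
    c       : hand-tuned constant (the paper reports c = 20)
    (hypothetical helper written for illustration)
    """
    n = len(x)
    sigma_i = np.std(x)                  # std of the i-th covariate
    h_i = c * sigma_i * n ** (-1 / 5)    # bandwidth h_i = c * sigma_i * n^{-1/5}
    diff = x[:, None] - x[None, :]       # pairwise differences x_i - x_i'
    return sigma_y * np.exp(-diff ** 2 / (2 * h_i ** 2))

# Toy usage on synthetic data
rng = np.random.default_rng(0)
x = rng.normal(size=50)
K = gaussian_kernel_matrix(x, sigma_y=1.0)
```

With σ_Y = 1 the diagonal of K is exactly 1, since exp(0) = 1; larger c widens the bandwidth and smooths the resulting fit, consistent with the reported robustness for c between 5 and 60.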