Feature selection using e-values
Authors: Subhabrata Majumdar, Snigdhansu Chatterjee
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implement e-values using a GBS with scaled resample weights W_ri ∼ Gamma(1, 1) − 1, and resample sizes R = R1 = 1000. We use Mahalanobis depth for all depth calculations. Mahalanobis depth is much less computation-intensive than other depth functions (Dyckerhoff & Mozharovskyi, 2016; Liu & Zuo, 2014), but is not usually preferred in applications due to its non-robustness. However, we do not use any robustness properties of data depth, so are able to use it without any concern. For each replication for each data setting and method, we compute performance metrics on test datasets of the same dimensions as the respective training dataset. All our results are based on 1000 such replications. |
| Researcher Affiliation | Collaboration | School of Statistics, University of Minnesota Twin Cities, Minneapolis, MN, USA; currently at Splunk. Correspondence to: Subhabrata Majumdar <smajumdar@splunk.com>. |
| Pseudocode | Yes | Algorithm 1 Best subset selection using e-values |
| Open Source Code | Yes | Code and data for the experiments in this paper are available at https://github.com/shubhobm/e-values. |
| Open Datasets | Yes | Indian monsoon data... obtain data on 35 potential covariates (see Appendix D) from National Climatic Data Center (NCDC) and National Oceanic and Atmospheric Administration (NOAA) repositories for 1978–2012. |
| Dataset Splits | Yes | We train our model on data from the years 1978–2002, run e-values best subset selection for tuning parameters τn ∈ {0.05, 0.1, ..., 1}. We consider two methods to select the best refitted model: (a) minimizing GBIC(τn), and (b) minimizing forecasting errors on samples from 2003–2012. |
| Hardware Specification | Yes | All computations were performed on a Windows desktop with an 8-core Intel Core-i7 6700K 4GHz CPU and 16GB RAM. |
| Software Dependencies | No | The paper mentions statistical methods and distributions like 'GBS with scaled resample weights W_ri ∼ Gamma(1, 1) − 1' and 'Mahalanobis depth', but it does not provide specific software names with version numbers (e.g., Python 3.x, PyTorch 1.x, scikit-learn x.x.x). |
| Experiment Setup | Yes | We implement e-values using a GBS with scaled resample weights W_ri ∼ Gamma(1, 1) − 1, and resample sizes R = R1 = 1000. |
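The two ingredients the setup rows describe, Mahalanobis depth and scaled Gamma resample weights for a generalized bootstrap (GBS), are both simple to reproduce. The sketch below is ours, not code from the paper's repository; the function name `mahalanobis_depth` and the sample sizes are illustrative assumptions, while the weight recipe W ∼ Gamma(1, 1) − 1 (mean 0, variance 1) and R = 1000 follow the quoted text.

```python
import numpy as np

def mahalanobis_depth(x, mean, cov_inv):
    # Mahalanobis depth: 1 / (1 + squared Mahalanobis distance to the mean).
    # It is maximal (exactly 1) at the mean and decays with distance.
    d = x - mean
    return 1.0 / (1.0 + d @ cov_inv @ d)

rng = np.random.default_rng(0)

# Illustrative data: n = 500 observations in p = 5 dimensions.
X = rng.normal(size=(500, 5))
mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))

print(mahalanobis_depth(mu, mu, cov_inv))  # → 1.0 at the mean

# Scaled resample weights for the generalized bootstrap:
# each weight is Gamma(1, 1) − 1, so E[W] = 0 and Var[W] = 1,
# with R = 1000 resamples as in the quoted setup.
R = 1000
W = rng.gamma(shape=1.0, scale=1.0, size=(R, X.shape[0])) - 1.0
```

Mahalanobis depth needs only one matrix inverse per dataset, which is consistent with the paper's remark that it is far cheaper than other depth functions.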