Feature selection using e-values

Authors: Subhabrata Majumdar, Snigdhansu Chatterjee

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We implement e-values using a GBS (generalized bootstrap) with scaled resample weights W_ri ∼ Gamma(1, 1) − 1, and resample sizes R = R1 = 1000. We use Mahalanobis depth for all depth calculations. Mahalanobis depth is much less computation-intensive than other depth functions (Dyckerhoff & Mozharovskyi, 2016; Liu & Zuo, 2014), but is not usually preferred in applications due to its non-robustness. However, we do not use any robustness properties of data depth, so we are able to use it without concern. For each replication of each data setting and method, we compute performance metrics on test datasets of the same dimensions as the respective training dataset. All our results are based on 1000 such replications.
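The two computational ingredients quoted above are easy to sketch. Below is a minimal Python illustration, assuming a plain weighted-least-squares form of the generalized bootstrap for linear regression and approximating the full model's e-value by the mean Mahalanobis depth of its R resampled coefficient estimates. The function names and the exact weighting scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def mahalanobis_depth(pts, mean, cov):
    """Mahalanobis depth D(x) = 1 / (1 + (x - mu)' Sigma^{-1} (x - mu))."""
    diff = pts - mean
    sq = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return 1.0 / (1.0 + sq)

def gbs_estimates(X, y, R=1000):
    """Generalized-bootstrap coefficient resamples for linear regression.

    Per-observation resample weights are 1 + W_ri with W_ri ~ Gamma(1,1) - 1
    (a centered exponential), so each weight has mean 1 and variance 1.
    This weighted-least-squares form is one standard GBS; the paper's
    exact scaling may differ.
    """
    n, p = X.shape
    betas = np.empty((R, p))
    for r in range(R):
        w = rng.gamma(1.0, 1.0, size=n)            # = 1 + (Gamma(1,1) - 1)
        Xw = X * w[:, None]
        betas[r] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
    return betas

# e-value of the full model: mean depth of the resampled estimates
# with respect to their own empirical distribution.
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, 2.0, 0.0, 0.0, -1.0]) + rng.normal(size=200)
B = gbs_estimates(X, y)
e_full = mahalanobis_depth(B, B.mean(axis=0), np.cov(B.T)).mean()
print(f"full-model e-value ~ {e_full:.3f}")
```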
Researcher Affiliation | Collaboration | ¹School of Statistics, University of Minnesota Twin Cities, Minneapolis, MN, USA. ²Currently at Splunk. Correspondence to: Subhabrata Majumdar <smajumdar@splunk.com>.
Pseudocode | Yes | Algorithm 1: Best subset selection using e-values
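As a companion to the pseudocode, here is a schematic reading of Algorithm 1: only the full model is ever fitted, the e-value of each reduced model is obtained by zeroing one coordinate of the full-model resamples, and covariate j is kept exactly when dropping it lowers the e-value. This sketches the selection rule only; the role of the tuning parameter τ_n in scaling the resamples follows the paper.

```python
import numpy as np

def mahalanobis_depth(pts, mean, cov):
    """Mahalanobis depth D(x) = 1 / (1 + (x - mu)' Sigma^{-1} (x - mu))."""
    diff = pts - mean
    sq = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return 1.0 / (1.0 + sq)

def best_subset_evalues(B):
    """Sketch of Algorithm 1 (best subset selection using e-values).

    B: (R, p) array of generalized-bootstrap resamples of the full-model
    coefficient estimate (e.g., from the previous sketch).
    Returns the indices of the selected covariates.
    """
    mean, cov = B.mean(axis=0), np.cov(B.T)
    e_full = mahalanobis_depth(B, mean, cov).mean()    # full-model e-value
    keep = []
    for j in range(B.shape[1]):
        Bj = B.copy()
        Bj[:, j] = 0.0                                 # emulate dropping covariate j
        e_j = mahalanobis_depth(Bj, mean, cov).mean()  # reduced-model e-value
        if e_j < e_full:                               # dropping j hurts, so keep it
            keep.append(j)
    return keep

# e.g. best_subset_evalues(B) with B from the previous sketch
```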
Open Source Code | Yes | Code and data for the experiments in this paper are available at https://github.com/shubhobm/e-values.
Open Datasets | Yes | Indian monsoon data... obtain data on 35 potential covariates (see Appendix D) from National Climatic Data Center (NCDC) and National Oceanic and Atmospheric Administration (NOAA) repositories for 1978–2012.
Dataset Splits | Yes | We train our model on data from the years 1978–2002, and run e-values best subset selection for tuning parameters τ_n ∈ {0.05, 0.1, . . . , 1}. We consider two methods to select the best refitted model: (a) minimizing GBIC(τ_n), and (b) minimizing forecasting errors on samples from 2003–2012.
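The quoted split implies a simple grid search over the tuning parameter; a schematic is below. Here `run_selection`, `gbic`, and `forecast_error` are hypothetical placeholders for the paper's e-values selection, its GBIC criterion, and the 2003–2012 holdout forecast error; only the tuning structure is shown.

```python
import numpy as np

tau_grid = np.arange(1, 21) * 0.05          # {0.05, 0.1, ..., 1}

def tune_tau(run_selection, criterion):
    """Choose tau_n by minimizing a model-selection criterion over the grid.

    run_selection(tau) -> model refitted on 1978-2002 training data after
                          e-values best subset selection at this tau;
    criterion(model)   -> GBIC(tau_n), or forecast error on the 2003-2012
                          holdout (both are placeholders for the paper's
                          actual procedures).
    """
    scored = [(criterion(run_selection(t)), t) for t in tau_grid]
    best_value, best_tau = min(scored)
    return best_tau
```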
Hardware Specification | Yes | All computations were performed on a Windows desktop with an 8-core Intel Core-i7 6700K 4GHz CPU and 16GB RAM.
Software Dependencies | No | The paper mentions statistical methods and distributions such as a 'GBS with scaled resample weights W_ri ∼ Gamma(1, 1) − 1' and 'Mahalanobis depth', but it does not name specific software with version numbers (e.g., Python 3.x, PyTorch 1.x, scikit-learn x.x.x).
Experiment Setup | Yes | We implement e-values using a GBS with scaled resample weights W_ri ∼ Gamma(1, 1) − 1, and resample sizes R = R1 = 1000.