A Comparison of Hamming Errors of Representative Variable Selection Methods

Authors: Zheng Tracy Ke, Longlin Wang

ICLR 2022

Reproducibility variables, results, and LLM responses:
Research Type: Experimental. "In this paper, we compare Lasso with 5 other methods: Elastic net, SCAD, forward selection, thresholded Lasso, and forward backward selection. We measure their performances theoretically by the expected Hamming error, assuming that the regression coefficients are iid drawn from a two-point mixture and that the Gram matrix is block-wise diagonal. By deriving the rates of convergence of Hamming errors and the phase diagrams, we obtain useful conclusions about the pros and cons of different methods. ... In Experiments 1-3, (n, p) = (1000, 300). In Experiment 4, (n, p) = (500, 1000). ... The results are consistent with the theoretical phase diagrams (see Figure 1)." (A simulation sketch of this setup appears after these entries.)
Researcher Affiliation: Academia. "Zheng Tracy Ke, Department of Statistics, Harvard University, Cambridge, MA 02138, USA, zke@fas.harvard.edu; Longlin Wang, Department of Statistics, Harvard University, Cambridge, MA 02138, USA, lwang2@fas.harvard.edu"
Pseudocode: No. The paper describes algorithms but does not provide structured pseudocode or algorithm blocks.
Open Source Code: No. The paper does not provide any statement about, or link to, open-source code for the methodology described.
Open Datasets: No. The paper states, "We generate (X, β) as in (3)-(4)", indicating synthetic data generation rather than the use of a publicly available dataset.
Dataset Splits: No. The paper describes running simulations over 50 or 500 repetitions and selecting ideal tuning parameters, but it does not specify traditional train/validation/test dataset splits for model training and evaluation.
Hardware Specification: No. The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory).
Software Dependencies: No. The paper does not list specific software dependencies with version numbers used for the experiments.
Experiment Setup: Yes. "For each method, we select the ideal tuning parameters that minimize the average Hamming error over 50 repetitions. ... We study the effect of tuning parameters in Lasso, thresholded Lasso (Thresh Lasso), forward selection (Forward Select), and forward backward selection (FB). In (a)-(b), we show the heatmap of averaged Hamming error (over 50 repetitions) of Thresh Lasso for a grid of (t, λ); when t = 0, it reduces to Lasso. In (c)-(d), we show the Hamming error of FB for a grid of (v, t); when v = 0, it reduces to Forward Select." (A grid-search sketch of this tuning procedure follows below.)
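The Research Type entry above describes the paper's simulation design: coefficients drawn iid from a two-point mixture, a linear model with Gaussian noise, and performance measured by the Hamming error between estimated and true supports. The following is a minimal sketch of that setup, assuming illustrative values for the mixture and noise parameters (epsilon, tau, sigma) and the Lasso penalty alpha, none of which are taken from the paper; an iid Gaussian design is used here, which only approximates the paper's block-wise diagonal Gram matrix.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 1000, 300                      # as in Experiments 1-3
epsilon, tau, sigma = 0.05, 2.0, 1.0  # hypothetical mixture/noise parameters

# Coefficients drawn iid from a two-point mixture:
# beta_j = tau with probability epsilon, and 0 otherwise.
beta = tau * (rng.random(p) < epsilon)

# iid Gaussian design; this only approximates the paper's
# block-wise diagonal Gram matrix.
X = rng.standard_normal((n, p))
y = X @ beta + sigma * rng.standard_normal(n)

# Hamming error: the number of coordinates where the estimated
# support disagrees with the true support.
beta_hat = Lasso(alpha=0.1, fit_intercept=False).fit(X, y).coef_
hamming_error = int(np.sum((beta_hat != 0) != (beta != 0)))
print(hamming_error)
```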
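The Experiment Setup entry describes selecting, for each method, the ideal tuning parameters that minimize the average Hamming error over 50 repetitions on a grid, e.g. (t, λ) for thresholded Lasso, which reduces to plain Lasso at t = 0. Below is a minimal sketch of that grid search, assuming a hypothetical simulate() helper that returns one fresh draw of (X, y, beta) as in the snippet above; the grids and repetition count are placeholders, not values from the paper.

```python
import numpy as np
from sklearn.linear_model import Lasso

def hamming_error(beta_hat, beta):
    """Count coordinates where estimated and true supports disagree."""
    return np.sum((beta_hat != 0) != (beta != 0))

def ideal_thresh_lasso_tuning(simulate, lambdas, thresholds, n_reps=50):
    """Average Hamming error of thresholded Lasso over n_reps fresh
    draws for every (t, lambda) pair; return the grid of averages and
    the minimizing pair (the 'ideal' tuning parameters)."""
    errors = np.zeros((len(thresholds), len(lambdas)))
    for _ in range(n_reps):
        X, y, beta = simulate()  # one fresh draw of (X, beta, y)
        for j, lam in enumerate(lambdas):
            # Fit Lasso once per lambda, then apply each threshold t:
            # keep only coefficients with |coef| > t (t = 0 is plain Lasso).
            coef = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_
            for i, t in enumerate(thresholds):
                coef_t = np.where(np.abs(coef) > t, coef, 0.0)
                errors[i, j] += hamming_error(coef_t, beta)
    errors /= n_reps
    i, j = np.unravel_index(errors.argmin(), errors.shape)
    return errors, (thresholds[i], lambdas[j])
```

The returned errors grid corresponds to the kind of heatmap described in the quoted passage; the same loop structure would apply to forward backward selection over its (v, t) grid, with the Lasso fit replaced by that method's fit.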