A Comparison of Hamming Errors of Representative Variable Selection Methods
Authors: Zheng Tracy Ke, Longlin Wang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we compare Lasso with 5 other methods: Elastic net, SCAD, forward selection, thresholded Lasso, and forward backward selection. We measure their performances theoretically by the expected Hamming error, assuming that the regression coefficients are iid drawn from a two-point mixture and that the Gram matrix is block-wise diagonal. By deriving the rates of convergence of Hamming errors and the phase diagrams, we obtain useful conclusions about the pros and cons of different methods. ... In Experiments 1-3, (n, p) = (1000, 300). In Experiment 4, (n, p) = (500, 1000). ... The results are consistent with the theoretical phase diagrams (see Figure 1). |
| Researcher Affiliation | Academia | Zheng Tracy Ke Department of Statistics Harvard University Cambridge, MA 02138, USA zke@fas.harvard.edu Longlin Wang Department of Statistics Harvard University Cambridge, MA 02138, USA lwang2@fas.harvard.edu |
| Pseudocode | No | The paper describes algorithms but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link for open-source code for the methodology described. |
| Open Datasets | No | The paper states, "We generate (X, β) as in (3)-(4)", indicating synthetic data generation rather than the use of a publicly available dataset. |
| Dataset Splits | No | The paper describes running simulations over 50 or 500 repetitions and selecting ideal tuning parameters, but it does not specify traditional train/validation/test dataset splits for model training and evaluation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers used for the experiments. |
| Experiment Setup | Yes | For each method, we select the ideal tuning parameters that minimize the average Hamming error over 50 repetitions. ... We study the effect of tuning parameters in Lasso, thresholded Lasso (Thresh Lasso), forward selection (Forward Select), and forward backward selection (FB). In (a)-(b), we show the heatmap of averaged Hamming error (over 50 repetitions) of Thresh Lasso for a grid of (t, λ); when t = 0, it reduces to Lasso. In (c)-(d), we show the Hamming error of FB for a grid of (v, t); when v = 0, it reduces to Forward Select. |
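The setup described above (coefficients drawn iid from a two-point mixture, variable selection by Lasso with optional hard thresholding, performance measured by Hamming error) can be sketched in a few lines. This is an illustrative toy version only, not the paper's code: the dimensions, the mixture parameters `eps` and `tau`, the penalty `lam`, and the threshold `t` are our own placeholder choices, and the design matrix is plain iid Gaussian rather than the paper's block-wise diagonal Gram matrix.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Illustrative sketch of the simulation design summarized in the table.
# beta_j = tau with probability eps, and 0 otherwise (two-point mixture).
# All parameter values here are placeholder choices, not the paper's.
rng = np.random.default_rng(0)
n, p = 200, 50          # far smaller than the paper's (n, p) for a quick demo
eps, tau = 0.1, 4.0     # sparsity level and signal strength
lam, t = 0.1, 0.5       # Lasso penalty and hard threshold (Thresh Lasso)

X = rng.standard_normal((n, p)) / np.sqrt(n)   # iid Gaussian design, normalized
beta = tau * (rng.random(p) < eps)             # two-point mixture coefficients
y = X @ beta + rng.standard_normal(n)          # linear model with Gaussian noise

fit = Lasso(alpha=lam, fit_intercept=False).fit(X, y)
beta_hat = fit.coef_.copy()
beta_hat[np.abs(beta_hat) < t] = 0.0           # t = 0 reduces to plain Lasso

# Hamming error: number of coordinates where the selected support
# disagrees with the true support.
hamming = int(np.sum((beta_hat != 0) != (beta != 0)))
print("Hamming error:", hamming)
```

To mirror the paper's protocol, one would repeat this over many independent draws (50 or 500 repetitions per the table) for a grid of tuning parameters such as `(t, lam)` and report the grid point minimizing the average Hamming error.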