Robust Inference for High-Dimensional Linear Models via Residual Randomization
Authors: Y. Samuel Wang, Si Kai Lee, Panos Toulis, Mladen Kolar
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive simulations, we illustrate our method's wider range of applicability as suggested by theory. In particular, we show that our method outperforms state-of-the-art methods in challenging, yet more realistic, settings where the distribution of covariates is heavy-tailed or the sample size is small, while it remains competitive in standard, well-behaved settings previously studied in the literature. |
| Researcher Affiliation | Academia | 1Booth School of Business, University of Chicago, Chicago, USA. Correspondence to: Y. Samuel Wang <swang24@uchicago.edu>, Si Kai Lee <sikai@uchicago.edu>. |
| Pseudocode | Yes | Algorithm 1: Test a⊤β = a0 |
| Open Source Code | Yes | Additional details are in the supplement, and the code is available at: https://github.com/atechnicolorskye/rrHDI. |
| Open Datasets | No | The paper describes generating synthetic data for its experiments ('we sample random X ∈ R^{n×p} with rows drawn i.i.d. from...', 'We sample the errors ε ∈ R^n from...') rather than using a publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper generates synthetic data for its experiments and performs '1000 replications'. It does not describe a traditional train/validation/test split for a fixed dataset, nor does it provide details on how to reproduce such splits for its generated data. |
| Hardware Specification | No | The acknowledgments mention 'resources provided by the University of Chicago Research Computing Center' but do not provide specific details on CPU, GPU, memory, or cloud instance types used for the experiments. |
| Software Dependencies | No | The paper mentions the 'fastclime package (Pang et al., 2014)' and 'RPtests (Shah & Bühlmann, 2017)' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | To obtain M_{λ⋆}, we solve (13) for up to 500 iterations using fastclime... We further select λ⋆ via (12) by using a grid search over the λ values used by fastclime with δ = 10,000. ... for the residual randomization procedure we employ the Square-Root Lasso... We follow Zhang & Cheng (2017) and rescale ε̂ by √(n/(n − \|β̂_l\|₀)) as a finite-sample correction. For all settings, we use 1,000 group actions/bootstrap resamples. |
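The rows above quote a residual randomization procedure for testing a linear hypothesis a⊤β = a0 with a fixed number of group actions/bootstrap resamples. As a rough illustration of the general idea (not the paper's Algorithm 1), the following is a minimal sketch for a simple low-dimensional OLS setting, using i.i.d. sign flips of residuals as the invariance group; the paper's actual high-dimensional method relies on the Square-Root Lasso and a debiasing matrix M_{λ⋆}, which are omitted here. The function name and all implementation details are illustrative assumptions.

```python
import numpy as np


def residual_randomization_pvalue(X, y, a, a0, num_draws=1000, rng=None):
    """Sketch of a residual randomization test for H0: a @ beta = a0.

    Illustrative low-dimensional OLS version only; the paper's
    high-dimensional procedure (Square-Root Lasso + debiasing) differs.
    """
    rng = np.random.default_rng(rng)

    # Unrestricted OLS fit and observed test statistic.
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    t_obs = a @ beta_hat - a0
    resid = y - X @ beta_hat

    # Randomization distribution: flip residual signs (assumes the error
    # distribution is symmetric), refit, and recenter the statistic.
    exceed = 0
    for _ in range(num_draws):
        signs = rng.choice([-1.0, 1.0], size=len(resid))
        y_star = X @ beta_hat + signs * resid
        beta_star, *_ = np.linalg.lstsq(X, y_star, rcond=None)
        t_star = a @ beta_star - a @ beta_hat
        if abs(t_star) >= abs(t_obs):
            exceed += 1

    # Standard finite-sample randomization p-value.
    return (1 + exceed) / (1 + num_draws)
```

With 1,000 draws (as in the paper's experiments) the smallest attainable p-value is 1/1001, which is why randomization tests fix the number of group actions in advance rather than adapting it to the data.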