Robust Inference for High-Dimensional Linear Models via Residual Randomization

Authors: Y. Samuel Wang, Si Kai Lee, Panos Toulis, Mladen Kolar

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive simulations, we illustrate our method's wider range of applicability as suggested by theory. In particular, we show that our method outperforms state-of-the-art methods in challenging, yet more realistic, settings where the distribution of covariates is heavy-tailed or the sample size is small, while it remains competitive in standard, well-behaved settings previously studied in the literature.
Researcher Affiliation | Academia | Booth School of Business, University of Chicago, Chicago, USA. Correspondence to: Y. Samuel Wang <swang24@uchicago.edu>, Si Kai Lee <sikai@uchicago.edu>.
Pseudocode | Yes | Algorithm 1: Test a⊤β = a0 (a hedged code sketch is given after this table).
Open Source Code | Yes | Additional details are in the supplement, and the code is available at: https://github.com/atechnicolorskye/rrHDI.
Open Datasets | No | The paper describes generating synthetic data for its experiments ('we sample random X ∈ ℝ^(n×p) with rows drawn i.i.d. from...', 'We sample the errors ε ∈ ℝ^n from...') rather than using a publicly available dataset with concrete access information.
Dataset Splits | No | The paper generates synthetic data for its experiments and performs '1000 replications'. It does not describe a traditional train/validation/test split for a fixed dataset, nor does it provide details on how to reproduce such splits for its generated data.
Hardware Specification | No | The acknowledgments mention 'resources provided by the University of Chicago Research Computing Center' but do not provide specific details on CPU, GPU, memory, or cloud instance types used for the experiments.
Software Dependencies | No | The paper mentions the 'fastclime package (Pang et al., 2014)' and 'RPtests (Shah & Bühlmann, 2017)' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | To obtain M_λ⋆, we solve (13) with up to 500 iterations using fastclime... We further select λ⋆ via (12) by a grid search over the λ values used by fastclime with δ = 10,000. ... For the residual randomization procedure we employ the Square-Root Lasso... We follow Zhang & Cheng (2017) and rescale ε̂ by √(n/(n − |β̂_l|_0)) as a finite-sample correction. For all settings, we use 1,000 group actions/bootstrap resamples. (Hedged code sketches of the test and this setup follow after the table.)
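
To make the 'Pseudocode' and 'Experiment Setup' rows concrete, below is a minimal sketch of a sign-flip residual randomization test of H0: a⊤β = a0, in the spirit of the paper's Algorithm 1. This is not the authors' implementation (that is the code at https://github.com/atechnicolorskye/rrHDI): the function name rr_test_sign_flip, the debiased-style statistic, and the use of scikit-learn's Lasso in place of the paper's Square-Root Lasso are illustrative assumptions. The sketch does include the Zhang & Cheng (2017) residual rescaling and defaults to the paper's 1,000 randomization draws.

```python
import numpy as np
from sklearn.linear_model import Lasso


def rr_test_sign_flip(X, y, a, a0, M, B=1000, lam=0.1, seed=0):
    """Sign-flip residual randomization test of H0: a @ beta = a0.

    M is an approximate inverse of the covariate covariance (a stand-in
    for the paper's CLIME-type M_lambda*); lam is the pilot Lasso penalty.
    Assumes the pilot fit selects fewer than n variables.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Pilot fit: plain Lasso as a stand-in for the paper's Square-Root Lasso.
    beta_hat = Lasso(alpha=lam).fit(X, y).coef_
    resid = y - X @ beta_hat

    # Finite-sample correction (Zhang & Cheng, 2017): rescale residuals
    # by sqrt(n / (n - ||beta_hat||_0)).
    s = np.count_nonzero(beta_hat)
    resid = resid * np.sqrt(n / (n - s))

    # Debiased-style observed statistic for the linear functional a @ beta.
    t_obs = (a @ beta_hat - a0) + a @ M @ (X.T @ resid) / n

    # Randomization distribution: flip residual signs B times and
    # recompute the same linear contrast under the null.
    t_rand = np.empty(B)
    for b in range(B):
        g = rng.choice([-1.0, 1.0], size=n)
        t_rand[b] = a @ M @ (X.T @ (g * resid)) / n

    # Two-sided randomization p-value with the usual add-one correction.
    p_value = (1 + np.sum(np.abs(t_rand) >= np.abs(t_obs))) / (B + 1)
    return t_obs, p_value
```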
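
Continuing from the sketch above, here is a toy invocation on synthetic data, loosely echoing the simulation setup quoted in the 'Open Datasets' row (random X, heavy-tailed errors). The dimensions, distributions, and the identity matrix standing in for the CLIME-type M_λ⋆ are all placeholder assumptions, not the paper's settings.

```python
import numpy as np

n, p = 100, 200
rng = np.random.default_rng(1)
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = 1.0
y = X @ beta + rng.standard_t(df=3, size=n)  # heavy-tailed errors

a = np.zeros(p)
a[0] = 1.0     # test H0: beta_1 = 1
M = np.eye(p)  # crude placeholder for the CLIME-type inverse M_lambda*

t_obs, pval = rr_test_sign_flip(X, y, a, a0=1.0, M=M, B=1000)
print(f"statistic = {t_obs:.3f}, p-value = {pval:.3f}")
```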