Robust Sparse Regression with Non-Isotropic Designs

Authors: Chih-Hung Liu, Gleb Novikov

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We develop a technique to design efficiently computable estimators for sparse linear regression in the simultaneous presence of two adversaries: oblivious and adaptive. Consider the model 𝑦* = 𝑋*𝛽* + πœ‚, where 𝑋* is an 𝑛 Γ— 𝑑 random design matrix, 𝛽* ∈ ℝ^𝑑 is a π‘˜-sparse vector, and the noise πœ‚ is independent of 𝑋* and chosen by the oblivious adversary. Apart from the independence of 𝑋*, we only require a small fraction of the entries of πœ‚ to have magnitude at most 1. The adaptive adversary is allowed to arbitrarily corrupt an πœ€-fraction of the samples (𝑋₁*, 𝑦₁*), …, (𝑋ₙ*, 𝑦ₙ*). Given the πœ€-corrupted samples (𝑋₁, 𝑦₁), …, (𝑋ₙ, 𝑦ₙ), the goal is to estimate 𝛽*. We assume that the rows of 𝑋* are iid samples from some 𝑑-dimensional distribution π’Ÿ with zero mean and (unknown) covariance matrix Ξ£ with bounded condition number. We design several robust algorithms that outperform the state of the art even in the special case of Gaussian noise πœ‚ ∼ 𝑁(0, 1)ⁿ. In particular, we provide a polynomial-time algorithm that with high probability recovers 𝛽* up to error 𝑂(βˆšπœ€) as long as 𝑛 β‰₯ 𝑂(π‘˜Β²/πœ€), only assuming some bounds on the third and the fourth moments of π’Ÿ. In addition, prior to this work, even in the special case of Gaussian design π’Ÿ = 𝑁(0, Ξ£) and noise πœ‚ ∼ 𝑁(0, 1), no polynomial-time algorithm was known to achieve error π‘œ(βˆšπœ€) in the sparse setting 𝑛 < 𝑑². We show that under some assumptions on the fourth and the eighth moments of π’Ÿ, there is a polynomial-time algorithm that achieves error π‘œ(βˆšπœ€) as long as 𝑛 β‰₯ 𝑂(π‘˜β΄/πœ€Β³). For the Gaussian distribution π’Ÿ = 𝑁(0, Ξ£), this algorithm achieves error 𝑂(πœ€^(3/4)). Moreover, our algorithm achieves error π‘œ(βˆšπœ€) for all log-concave distributions if πœ€ ≀ 1/polylog(𝑑). Our algorithms are based on filtering of the covariates using sum-of-squares relaxations, and on weighted Huber loss minimization with an ℓ₁ regularizer. We provide a novel analysis of the weighted penalized Huber loss that is suitable for heavy-tailed designs in the presence of two adversaries. Furthermore, we complement our algorithmic results with Statistical Query lower bounds, providing evidence that our estimators are likely to have nearly optimal sample complexity. (Illustrative sketches of the filtering and weighted-Huber ingredients appear after this table.)
Researcher Affiliation | Academia | Chih-Hung Liu, Department of Electrical Engineering, National Taiwan University, chliu@ntu.edu.tw; Gleb Novikov, Lucerne School of Computer Science and Information Technology, gleb.novikov@hslu.ch
Pseudocode | Yes | Algorithm C.1 (Filtering algorithm). (A simplified stand-in for such a filtering loop is sketched after this table.)
Open Source Code | No | The NeurIPS checklist for the paper indicates: 'This is a theory paper, and does not include experiments.' There is no explicit statement or link in the paper providing open-source code for the described methodology.
Open Datasets | No | The NeurIPS checklist for the paper indicates: 'This is a theory paper, and does not include experiments.' The paper does not use or provide access to any specific public datasets for training.
Dataset Splits | No | The NeurIPS checklist for the paper indicates: 'This is a theory paper, and does not include experiments.' The paper does not provide details on training, validation, or test dataset splits.
Hardware Specification | No | The NeurIPS checklist for the paper indicates: 'This is a theory paper, and does not include experiments.' No hardware specifications are provided.
Software Dependencies | No | The NeurIPS checklist for the paper indicates: 'This is a theory paper, and does not include experiments.' No specific software dependencies with version numbers are mentioned for replication.
Experiment Setup | No | The NeurIPS checklist for the paper indicates: 'This is a theory paper, and does not include experiments.' As such, no experimental setup details like hyperparameters are provided.
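
To make the two algorithmic ingredients named in the abstract concrete, the sketches below are illustrative stand-ins, not the authors' algorithms. The first shows the overall shape of a covariate-filtering loop: the paper's Algorithm C.1 scores points via sum-of-squares relaxations, whereas this toy version substitutes a plain top-eigenvector (spectral) score; the function name spectral_filter and the threshold bound are hypothetical.

```python
# Illustrative covariate-filtering loop (NOT the paper's Algorithm C.1):
# the paper scores points with sum-of-squares relaxations; this toy
# version uses the top eigenvector of the empirical second moment.
import numpy as np

def spectral_filter(X, bound, max_rounds=100):
    """Iteratively drop the points most aligned with the direction of
    largest variance until the empirical second moment is below `bound`."""
    keep = np.ones(X.shape[0], dtype=bool)
    for _ in range(max_rounds):
        Z = X[keep]
        M = Z.T @ Z / Z.shape[0]       # empirical second-moment matrix
        eigvals, eigvecs = np.linalg.eigh(M)
        if eigvals[-1] <= bound:
            break                      # largest variance is certified small
        v = eigvecs[:, -1]             # direction of largest excess variance
        scores = (X @ v) ** 2
        # drop the 1% of still-kept points with the largest scores
        cutoff = np.quantile(scores[keep], 0.99)
        keep &= scores < cutoff
    return keep
```

Deterministic top-quantile removal is used here for brevity; filtering algorithms in the robust-statistics literature often instead downweight points randomly with probability proportional to their scores.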
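The second ingredient, weighted Huber loss minimization with an ℓ₁ regularizer, can be sketched with proximal gradient descent (ISTA). This is a minimal sketch under assumed placeholder choices of the Huber transition delta, the penalty lam, and the weights w (which in the paper would come from the filtering step); weighted_huber_lasso is a hypothetical helper, not the paper's estimator.

```python
# Minimal sketch: weighted l1-penalized Huber regression via proximal
# gradient descent. All tuning choices below are placeholder assumptions.
import numpy as np

def huber_grad(t, delta):
    """Derivative of the Huber loss: identity on [-delta, delta], clipped outside."""
    return np.clip(t, -delta, delta)

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (coordinatewise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def weighted_huber_lasso(X, y, w, lam, delta=1.0, iters=2000):
    """Minimize sum_i w_i * H_delta(y_i - <X_i, beta>) + lam * ||beta||_1."""
    n, d = X.shape
    # step size 1/L, where L upper-bounds the smooth part's Lipschitz constant
    L = np.linalg.norm(np.sqrt(w)[:, None] * X, 2) ** 2
    step = 1.0 / max(L, 1e-12)
    beta = np.zeros(d)
    for _ in range(iters):
        residual = y - X @ beta
        grad = -X.T @ (w * huber_grad(residual, delta))
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Toy run: Gaussian noise plus an adversary grossly corrupting an
# eps-fraction of the responses; weights are uniform in this demo.
rng = np.random.default_rng(0)
n, d, k, eps = 1000, 200, 5, 0.05
X = rng.normal(size=(n, d))
beta_star = np.zeros(d)
beta_star[:k] = 1.0
y = X @ beta_star + rng.normal(size=n)
y[: int(eps * n)] = 1e3                 # corrupted samples
w = np.ones(n)                          # would come from the filtering step
beta_hat = weighted_huber_lasso(X, y, w, lam=2.0 * np.sqrt(n * np.log(d)))
```

The Huber loss grows only linearly outside [-delta, delta], so the clipped gradient caps the influence of the grossly corrupted responses, while the soft-thresholding step enforces sparsity; this is the standard mechanism behind penalized Huber estimators, simplified from the paper's weighted analysis.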