Robust Regression Revisited: Acceleration and Improved Estimation Rates
Authors: Arun Jambulapati, Jerry Li, Tselil Schramm, Kevin Tian
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We present nearly-linear time algorithms for robust regression problems with improved runtime or estimation guarantees compared to the state-of-the-art. Finally, we remove κ dependences in sample complexity statements, as they are subsumed by ϵ dependences due to assumed bounds on ϵκ or ϵκ2. Theorem 1 (informal, cf. Theorem 7, supplement). Suppose γ : R2 R is such that γy(v) := γ(v, y) is convex and has (absolute) first and second derivatives at most 1 for all y in the support of Dy, and DX has second moment matrix Σ LI. For some µ 0, let κ := max(1, L µ ) and let θ reg := argminθ Rd E (X,y) DXy {γ ( θ, X , y)} + µ be the solution to the true regularized statistical regression problem.2 There is an algorithm that given n := e O( d ϵ ) ϵ-corrupted samples from DXy, for ϵκ2 at most an absolute constant, obtains θ with θ θ reg 2 = O q µ with high probability in time e O(nd κ). |
| Researcher Affiliation | Collaboration | Arun Jambulapati Stanford University jmblpati@stanford.edu Jerry Li Microsoft Research jerrl@microsoft.com Tselil Schramm Stanford University tselil@stanford.edu Kevin Tian Stanford University kjtian@stanford.edu |
| Pseudocode | Yes | Algorithm 1 Half Radius Accel( θ, R, Onp) and Algorithm 2 Half Radius Lin Reg(X, y, w, θ, R, OERM) |
| Open Source Code | No | The paper does not contain any explicit statements about providing open-source code for the methodology described. The reproducibility checklist states 'N/A' for inclusion of code. |
| Open Datasets | No | The paper is theoretical and focuses on algorithm design and analysis. It does not describe experiments using specific publicly available datasets for training, referring instead to a statistical model of 'ϵ-corrupted samples from DXy'. |
| Dataset Splits | No | The paper is theoretical and does not describe specific dataset splits (training, validation, testing) as it does not conduct empirical experiments. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for running experiments. The checklist for 'If you ran experiments' is marked 'N/A'. |
| Software Dependencies | No | The paper is theoretical and does not list specific software dependencies with version numbers. The checklist for 'If you ran experiments' is marked 'N/A'. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with specific hyperparameters or training details. The checklist for 'If you ran experiments' is marked 'N/A'. |