reproducibilityindex.ai

Robust Regression Revisited: Acceleration and Improved Estimation Rates

Authors: Arun Jambulapati, Jerry Li, Tselil Schramm, Kevin Tian

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We present nearly-linear time algorithms for robust regression problems with improved runtime or estimation guarantees compared to the state-of-the-art. Finally, we remove κ dependences in sample complexity statements, as they are subsumed by ϵ dependences due to assumed bounds on ϵκ or ϵκ2. Theorem 1 (informal, cf. Theorem 7, supplement). Suppose γ : R2 R is such that γy(v) := γ(v, y) is convex and has (absolute) ﬁrst and second derivatives at most 1 for all y in the support of Dy, and DX has second moment matrix Σ LI. For some µ 0, let κ := max(1, L µ ) and let θ reg := argminθ Rd E (X,y) DXy {γ ( θ, X , y)} + µ be the solution to the true regularized statistical regression problem.2 There is an algorithm that given n := e O( d ϵ ) ϵ-corrupted samples from DXy, for ϵκ2 at most an absolute constant, obtains θ with θ θ reg 2 = O q µ with high probability in time e O(nd κ).
Researcher Affiliation	Collaboration	Arun Jambulapati Stanford University jmblpati@stanford.edu Jerry Li Microsoft Research jerrl@microsoft.com Tselil Schramm Stanford University tselil@stanford.edu Kevin Tian Stanford University kjtian@stanford.edu
Pseudocode	Yes	Algorithm 1 Half Radius Accel( θ, R, Onp) and Algorithm 2 Half Radius Lin Reg(X, y, w, θ, R, OERM)
Open Source Code	No	The paper does not contain any explicit statements about providing open-source code for the methodology described. The reproducibility checklist states 'N/A' for inclusion of code.
Open Datasets	No	The paper is theoretical and focuses on algorithm design and analysis. It does not describe experiments using specific publicly available datasets for training, referring instead to a statistical model of 'ϵ-corrupted samples from DXy'.
Dataset Splits	No	The paper is theoretical and does not describe specific dataset splits (training, validation, testing) as it does not conduct empirical experiments.
Hardware Specification	No	The paper is theoretical and does not describe any specific hardware used for running experiments. The checklist for 'If you ran experiments' is marked 'N/A'.
Software Dependencies	No	The paper is theoretical and does not list specific software dependencies with version numbers. The checklist for 'If you ran experiments' is marked 'N/A'.
Experiment Setup	No	The paper is theoretical and does not describe an experimental setup with specific hyperparameters or training details. The checklist for 'If you ran experiments' is marked 'N/A'.