Efficient List-Decodable Regression using Batches
Authors: Abhimanyu Das, Ayush Jain, Weihao Kong, Rajat Sen
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We demonstrate the use of batches in studying list-decodable linear regression, in which only an α ∈ (0, 1] fraction of batches contain genuine samples from a common distribution and the rest can contain arbitrary or even adversarial samples. When genuine batches have Ω̃(1/α) samples each, our algorithm can efficiently find a small list of potential regression parameters, with a high probability that one of them is close to the true parameter. This is the first polynomial-time algorithm for list-decodable linear regression, and its sample complexity scales nearly linearly with the dimension of the covariates. |
| Researcher Affiliation | Collaboration | ¹Google Research, ²UC San Diego. The majority of this work was completed while Ayush Jain was an intern at Google Research. Correspondence to: Ayush Jain <ayjain@ucsd.edu>. |
| Pseudocode | Yes | Algorithm 1 MAINALGORITHM. 1: Input: Data {{(x^b_i, y^b_i)}_{i∈[n]}}_{b∈B}, α, C, σ. 2: For each b ∈ B, β^b_init ← 1, and β_init ← {β^b_init}_{b∈B}. 3: List L ← {β_init} and M ← ∅. 4: while L ≠ ∅ do 5: Pick any element β in L and remove it from L. [...] Algorithm 2 FINDCLIPPINGPARAMETER. 1: Input: Set B, β, σ, a1, a2, data {{(x^b_i, y^b_i)}_{i∈[n]}}_{b∈B}. 2: κ ← … 3: while True do 4: w_κ ← any approximate stationary point of clipped losses {f^b(·, κ)} w.r.t. weight vector β such that E_β[f^b(w_κ, κ)] ≤ log(2/α)σ |
| Open Source Code | No | The paper does not provide any statements about releasing code, nor does it include links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not use or reference any specific public or open datasets for training. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments involving dataset splits (e.g., training, validation, test splits). |
| Hardware Specification | No | The paper is theoretical and does not describe empirical experiments; therefore, no hardware specifications are mentioned. |
| Software Dependencies | No | The paper does not mention any specific software dependencies or version numbers needed to replicate the work. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithm design and proofs. It does not describe any experimental setup details such as hyperparameters or system-level training settings. |
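Since the paper is theoretical and ships no code, the following is a minimal sketch, not the authors' algorithm, of the batch setting the abstract describes: an α fraction of batches are genuine draws from a common linear model, the rest are adversarial, and a per-batch clipped squared loss (the `f^b(·, κ)` appearing in the pseudocode) separates the two populations. All dimensions, constants, and the clipping level `kappa` below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n, B, alpha = 5, 20, 50, 0.4        # covariate dim, batch size, #batches, genuine fraction
beta_true = rng.normal(size=d)         # the common regression parameter

batches = []
for b in range(B):
    X = rng.normal(size=(n, d))
    if b < int(alpha * B):
        # Genuine batch: y = <x, beta_true> + small Gaussian noise.
        y = X @ beta_true + rng.normal(scale=0.1, size=n)
    else:
        # Adversarial batch: responses unrelated to the model.
        y = rng.normal(scale=10.0, size=n)
    batches.append((X, y))

def clipped_batch_loss(w, X, y, kappa):
    """Mean of per-sample squared residuals, each clipped at kappa."""
    residuals_sq = (y - X @ w) ** 2
    return np.minimum(residuals_sq, kappa).mean()

# Near beta_true, genuine batches have tiny clipped loss while
# adversarial batches saturate at the clipping level kappa.
kappa = 1.0
losses = [clipped_batch_loss(beta_true, X, y, kappa) for X, y in batches]
```

Clipping is what makes the loss robust here: an adversarial sample can contribute at most `kappa` to a batch's loss, so corrupted batches cannot dominate the objective no matter how extreme their responses are.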