Robustly Learning Single-Index Models via Alignment Sharpness
Authors: Nikos Zarifis, Puqian Wang, Ilias Diakonikolas, Jelena Diakonikolas
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We give an efficient learning algorithm, achieving a constant factor approximation to the optimal loss, that succeeds under a range of distributions (including log-concave distributions) and a broad class of monotone and Lipschitz link functions. This is the first efficient constant factor approximate agnostic learner, even for Gaussian data and for any nontrivial class of link functions. Prior work for the case of unknown link function either works in the realizable setting or does not attain constant factor approximation. The main technical ingredient enabling our algorithm and analysis is a novel notion of a local error bound in optimization that we term alignment sharpness and that may be of broader interest. (A sketch of this setup appears below the table.) |
| Researcher Affiliation | Academia | Department of Computer Science, University of Wisconsin-Madison, Madison, WI, USA. |
| Pseudocode | Yes | Algorithm 1 Optimization, Algorithm 2 Initialization, Algorithm 3 Optimization, Algorithm 4 Testing. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | No | The paper defines properties for data distributions ('(L, R)-well-behaved') and refers to 'i.i.d. samples from D' but does not specify any named, publicly available datasets or provide links/citations for data access. |
| Dataset Splits | No | The paper operates within a theoretical framework and does not provide specific details on dataset splits (e.g., percentages, sample counts, or predefined splits) for training, validation, or testing. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU or CPU models, memory specifications) used for computations or algorithm analysis. |
| Software Dependencies | No | The paper does not provide specific software dependencies, such as library names with version numbers, that would be needed to replicate the work. |
| Experiment Setup | No | The paper specifies algorithmic parameters (e.g., step size η, batch size m, number of iterations T) as part of its theoretical design and analysis, but it does not describe an empirical experimental setup, such as hyperparameters for training a model on real data (an illustrative sketch of such a loop follows the table). |
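
For reference, the guarantee quoted in the Research Type row matches the standard agnostic single-index setup. The display below is a sketch of that setup in our own notation; the paper's exact loss and normalization conventions may differ.

```latex
% Sketch of the agnostic single-index setup (notation ours, not
% necessarily the paper's). Labels depend on x only through a
% one-dimensional projection w . x, passed through an unknown
% monotone, Lipschitz link function sigma.
\[
  \mathcal{L}_D(w, \sigma) \;=\; \mathbb{E}_{(x,y) \sim D}\!\left[ \big( \sigma(w \cdot x) - y \big)^2 \right],
  \qquad
  \mathrm{OPT} \;=\; \inf_{w,\, \sigma} \mathcal{L}_D(w, \sigma).
\]
% A constant-factor approximate agnostic learner outputs a hypothesis
% with loss at most C * OPT + epsilon for an absolute constant C,
% which is the guarantee the abstract describes.
```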
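
The Experiment Setup row notes that the parameters η, m, and T appear only in the theoretical analysis. As a purely illustrative sketch of how a loop with those parameters typically looks for a single-index model, here is a generic projected mini-batch SGD in Python. The sigmoid link, the squared-loss gradient, and the unit-sphere projection are our assumptions; this is not the paper's Algorithm 1.

```python
import numpy as np

def sigmoid(t):
    """A monotone, Lipschitz link function (illustrative choice)."""
    return 1.0 / (1.0 + np.exp(-t))

def projected_sgd_single_index(X, y, eta=0.1, m=32, T=500, rng=None):
    """Illustrative mini-batch SGD for fitting w in y ~ sigma(w . x),
    keeping the iterate on the unit sphere. Hypothetical sketch only;
    NOT the paper's algorithm."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)
    for _ in range(T):
        # Draw a mini-batch of size m (requires m <= n).
        idx = rng.choice(n, size=m, replace=False)
        Xb, yb = X[idx], y[idx]
        pred = sigmoid(Xb @ w)
        # Gradient of (1/m) * sum_i (sigma(w . x_i) - y_i)^2 in w,
        # using sigmoid'(t) = sigmoid(t) * (1 - sigmoid(t)).
        grad = (2.0 / m) * Xb.T @ ((pred - yb) * pred * (1.0 - pred))
        w -= eta * grad
        w /= np.linalg.norm(w)  # project back onto the unit sphere
    return w
```

For example, on synthetic data one might call `projected_sgd_single_index(X, y, eta=0.05, m=64, T=1000)`; the normalization step keeps the iterate a unit direction, which is the natural parameterization when only the projection direction of a single-index model is identifiable.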