Robustly Learning Single-Index Models via Alignment Sharpness

Authors: Nikos Zarifis, Puqian Wang, Ilias Diakonikolas, Jelena Diakonikolas

ICML 2024

Reproducibility Assessment (variable, result, and LLM response for each item)

Research Type: Theoretical
LLM Response: We give an efficient learning algorithm, achieving a constant factor approximation to the optimal loss, that succeeds under a range of distributions (including log-concave distributions) and a broad class of monotone and Lipschitz link functions. This is the first efficient constant factor approximate agnostic learner, even for Gaussian data and for any nontrivial class of link functions. Prior work for the case of unknown link function either works in the realizable setting or does not attain constant factor approximation. The main technical ingredient enabling our algorithm and analysis is a novel notion of a local error bound in optimization that we term alignment sharpness and that may be of broader interest.

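To make the setting concrete, the display below sketches the standard agnostic single-index model formulation this line of work targets, plus a generic sharpness-style error bound. The notation (σ, w*, the link class F, the constants μ and C) is generic and the second inequality is an illustrative assumption about the flavor of "alignment sharpness", not the paper's exact statement, which involves its specific surrogate loss and distributional conditions.

```latex
% Single-index model: an unknown unit vector w^* and a monotone, Lipschitz
% link \sigma determine labels up to adversarial (agnostic) noise:
\[
  f_{w,\sigma}(x) = \sigma(\langle w, x \rangle), \qquad
  \mathrm{OPT} = \min_{\|w\|_2 = 1,\; \sigma \in \mathcal{F}}
    \mathbb{E}_{(x,y)\sim D}\big[(\sigma(\langle w, x\rangle) - y)^2\big].
\]
% A constant-factor agnostic learner outputs h with
% E[(h(x) - y)^2] <= C * OPT for an absolute constant C.
% Illustrative sharpness-style bound (assumption, not the paper's statement):
% the (surrogate) loss grows at least linearly in the misalignment,
\[
  \mathcal{L}(w) - \mathcal{L}(w^*) \;\gtrsim\;
  \mu \,\big(1 - \langle w, w^*\rangle\big) \;-\; O(\mathrm{OPT}).
\]
```
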
Researcher Affiliation: Academia
LLM Response: Department of Computer Science, University of Wisconsin-Madison, Madison, WI, USA.

Pseudocode: Yes
LLM Response: Algorithm 1 (Optimization), Algorithm 2 (Initialization), Algorithm 3 (Optimization), Algorithm 4 (Testing).

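Since no source code is released (see the Open Source Code entry below), here is a minimal, hypothetical Python sketch of the flavor of the optimization routines: projected minibatch SGD over unit-norm directions for a single-index squared loss. The function names, the tanh link, the Gaussian synthetic data, the plain squared loss, and the default parameters are all illustrative assumptions; the paper's algorithms use a specific surrogate loss, an initialization step (Algorithm 2), and a testing step (Algorithm 4) to obtain their constant-factor guarantee.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_synthetic_data(n, d, w_star, noise=0.1):
    """Synthetic single-index data y = tanh(<w*, x>) + noise (illustrative only)."""
    X = rng.standard_normal((n, d))
    y = np.tanh(X @ w_star) + noise * rng.standard_normal(n)
    return X, y

def projected_minibatch_sgd(X, y, eta=0.1, m=32, T=500):
    """Projected minibatch SGD over unit-norm directions for the squared loss
    E[(tanh(<w, x>) - y)^2]; a generic stand-in, not the paper's algorithm."""
    n, d = X.shape
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)                      # start on the unit sphere
    for _ in range(T):
        idx = rng.integers(0, n, size=m)        # minibatch of size m
        Xb, yb = X[idx], y[idx]
        z = Xb @ w
        residual = np.tanh(z) - yb
        # squared-loss gradient, using tanh'(z) = 1 - tanh(z)^2
        grad = (2.0 / m) * (Xb.T @ (residual * (1.0 - np.tanh(z) ** 2)))
        w -= eta * grad
        w /= np.linalg.norm(w)                  # project back to the sphere
    return w

if __name__ == "__main__":
    d = 20
    w_star = np.zeros(d)
    w_star[0] = 1.0
    X, y = make_synthetic_data(5000, d, w_star)
    w_hat = projected_minibatch_sgd(X, y)
    print("alignment <w_hat, w*> =", float(w_hat @ w_star))
```
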
Open Source Code: No
LLM Response: The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository.

Open Datasets: No
LLM Response: The paper defines properties for data distributions ("(L, R)-well-behaved") and refers to i.i.d. samples from D, but it does not specify any named, publicly available datasets or provide links/citations for data access.

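For context, "well-behaved" distribution assumptions in this literature typically combine concentration with anti-anti-concentration of low-dimensional projections. The display below is a rough paraphrase of the style of such definitions, offered as an assumption; the paper's exact (L, R)-well-behaved condition may differ in its constants and quantifiers.

```latex
% Rough paraphrase (assumption): D on R^d is (L, R)-well-behaved if, for every
% projection of D onto a low-dimensional (e.g., two-dimensional) subspace,
% the projected density \gamma is bounded below near the origin,
\[
  \gamma(x) \ge L \quad \text{for all } \|x\|_2 \le R,
\]
% together with standard concentration of the marginals; log-concave
% distributions satisfy conditions of this kind with absolute constants.
```
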
Dataset Splits: No
LLM Response: The paper operates within a theoretical framework and does not provide specific details on dataset splits (e.g., percentages, sample counts, or predefined splits) for training, validation, or testing.

Hardware Specification: No
LLM Response: The paper does not specify any particular hardware (e.g., GPU or CPU models, memory specifications) used for computations or algorithm analysis.

Software Dependencies: No
LLM Response: The paper does not provide specific software dependencies, such as library names with version numbers, that would be needed to replicate the work.

Experiment Setup: No
LLM Response: The paper describes algorithmic parameters (e.g., step size η, batch size m, number of iterations T) as part of its theoretical design and analysis, but it does not provide details of an empirical experimental setup, such as hyperparameters for training a model on real data.
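To illustrate how those quantities appear in practice, the snippet below instantiates the hypothetical sketch from the Pseudocode entry above with concrete (arbitrary) values. In the paper, η, m, and T are set as functions of the accuracy parameter and the distributional constants in the theorems, not as empirically tuned hyperparameters.

```python
# Hypothetical instantiation of the analysis parameters (values are arbitrary);
# reuses projected_minibatch_sgd and the data X, y from the sketch above.
w_hat = projected_minibatch_sgd(X, y, eta=0.05, m=64, T=2000)
```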