Learning General Halfspaces with Adversarial Label Noise via Online Gradient Descent

Authors: Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Our main result is a simple gradient-based iteration that efficiently converges to a halfspace with error O(ϵ) after only poly log(1/ϵ) rounds. Theorem 4.1 (Online Gradient Descent Learner). Fix ϵ, δ ∈ (0, 1/2) and let D be an ϵ-corrupted distribution on R^d × {±1} with standard normal x-marginal. Denote by D̂ the empirical distribution formed with N = O(d log(1/δ)/ϵ^2) samples from D. Then the Online Gradient Descent Algorithm 2, after T = O(log^4(1/ϵ)) iterations, returns a vector w^(T) and a threshold t such that, with probability at least 1 − δ, it holds that Pr_{(x,y)∼D}[sign(w^(T) · x + t) ≠ y] ≤ Cϵ, where C > 0 is a universal constant.
Researcher Affiliation | Academia | Department of Computer Sciences, University of Wisconsin, Madison, Wisconsin, USA. Correspondence to: Vasilis Kontonis <kontonis@wisc.edu>, Nikos Zarifis <zarifis@wisc.edu>.
Pseudocode | Yes | Algorithm 1: Online Projected Gradient Descent (OPGD); Algorithm 2: Online Projected Gradient Descent Learner. An illustrative sketch of this iteration is given after the table.
Open Source Code | No | The paper does not provide any concrete access information (specific repository link, explicit code-release statement, or code in supplementary materials) for the methodology described.
Open Datasets | No | The paper refers to an "ϵ-corrupted distribution on R^d × {±1} with standard normal x-marginal" and "N samples" but does not specify a named public dataset with access information (link, DOI, or formal citation).
Dataset Splits | No | This is a theoretical paper and does not include details on dataset splits (train/validation/test) for empirical experiments.
Hardware Specification | No | The paper is theoretical and does not specify any hardware (GPU or CPU models, etc.) used for running experiments.
Software Dependencies | No | The paper describes a theoretical algorithm and its analysis, but does not specify any software dependencies with version numbers.
Experiment Setup | No | The paper defines parameters for its theoretical algorithm (e.g., step size, number of iterations) but does not provide experimental setup details such as hyperparameters, optimizer settings, or training configurations for an empirical study.
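For illustration only, the following is a minimal sketch of an online projected gradient descent learner for a noisy halfspace under a standard normal x-marginal, in the spirit of the pseudocode named above. It is not the paper's Algorithm 1 or 2: the LeakyReLU-style surrogate loss, the step size, the leakage parameter, the random-flip corruption model, the iterate averaging, and the omission of the threshold t are all assumptions made here to keep the example self-contained and runnable.

```python
import numpy as np

def sample_corrupted_halfspace(n, d, w_star, eps, rng):
    # Standard normal x-marginal; clean labels sign(w_star . x), then an
    # eps-fraction of labels flipped to mimic adversarial label noise
    # (illustrative corruption model, not the paper's adversary).
    X = rng.standard_normal((n, d))
    y = np.sign(X @ w_star)
    y[y == 0] = 1.0
    flip = rng.random(n) < eps
    y[flip] *= -1.0
    return X, y

def project_unit_ball(w):
    # Euclidean projection onto the unit ball.
    norm = np.linalg.norm(w)
    return w if norm <= 1.0 else w / norm

def opgd_halfspace(X, y, T=2000, eta=0.05, lam=0.1):
    # Online projected gradient descent on a LeakyReLU-style surrogate loss,
    # streaming over the sample. The surrogate, step size eta, leakage lam,
    # and iterate averaging are illustrative assumptions, not the exact
    # choices of Algorithms 1/2 in the paper.
    n, d = X.shape
    w = np.zeros(d)
    w[0] = 1.0          # arbitrary unit-norm initialization (assumption)
    w_avg = np.zeros(d)
    for t in range(T):
        i = t % n
        margin = -y[i] * (X[i] @ w)
        slope = 1.0 if margin > 0 else lam   # subgradient of LeakyReLU
        grad = slope * (-y[i]) * X[i]
        w = project_unit_ball(w - eta * grad)
        w_avg += w
    return w_avg / T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 20
    w_star = np.zeros(d)
    w_star[0] = 1.0
    X, y = sample_corrupted_halfspace(5000, d, w_star, eps=0.05, rng=rng)
    w_hat = opgd_halfspace(X, y)
    zero_one_error = np.mean(np.sign(X @ w_hat) != y)
    print(f"empirical 0-1 error of the learned halfspace: {zero_one_error:.3f}")
```

The sketch learns an origin-centered halfspace and simply reports the empirical 0-1 error of the averaged iterate; in the paper, by contrast, the learner also outputs a threshold t, the iteration count scales as poly log(1/ϵ), and the guarantee is the distributional error bound Cϵ of Theorem 4.1.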