Distribution-Independent PAC Learning of Halfspaces with Massart Noise

Authors: Ilias Diakonikolas, Themis Gouleakis, Christos Tzamos

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study the problem of distribution-independent PAC learning of halfspaces in the presence of Massart noise. Specifically, we are given a set of labeled examples (x, y) drawn from a distribution D on R^{d+1} such that the marginal distribution on the unlabeled points x is arbitrary and the labels y are generated by an unknown halfspace corrupted with Massart noise at noise rate η < 1/2. The goal is to find a hypothesis h that minimizes the misclassification error Pr_{(x,y)∼D}[h(x) ≠ y]. We give a poly(d, 1/ϵ) time algorithm for this problem with misclassification error η + ϵ. We also provide evidence that improving on the error guarantee of our algorithm might be computationally hard. (The noise model and error objective are illustrated in the first sketch after the table.)
Researcher Affiliation | Academia | Ilias Diakonikolas (University of Wisconsin-Madison, ilias@cs.wisc.edu); Themis Gouleakis (Max Planck Institute for Informatics, tgouleak@mpi-inf.mpg.de); Christos Tzamos (University of Wisconsin-Madison, tzamos@wisc.edu)
Pseudocode | Yes | Algorithm 1 "Main Algorithm (with margin)" and Algorithm 2 "Main Algorithm (general case)" are provided with structured steps. (An illustrative sketch in the spirit of the margin case appears second after the table.)
Open Source Code | No | The paper contains no explicit statement about making the source code for its methodology publicly available, and it provides no link to a code repository.
Open Datasets | No | The paper discusses learning from 'i.i.d. examples from a distribution D' with an 'arbitrary distribution over X', but it does not specify or provide access information for any publicly available dataset. The work is theoretical.
Dataset Splits | No | The paper does not specify any training, validation, or test splits. It focuses on theoretical algorithms and their guarantees rather than empirical evaluation.
Hardware Specification | No | The paper does not mention any hardware used to run experiments or algorithms; as a theoretical paper, it has no such details to report.
Software Dependencies | No | The paper does not list software components or version numbers needed for reproducibility. It is a theoretical paper focused on algorithms.
Experiment Setup | No | The paper does not provide details of an experimental setup, such as hyperparameters or training settings. This is consistent with its theoretical focus on algorithmic guarantees rather than empirical implementation.
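The following is a minimal sketch of the setting summarized in the Research Type row: labels from a ground-truth halfspace are flipped independently with an x-dependent probability η(x) bounded by η (the Massart condition), and a hypothesis is scored by its misclassification error Pr_{(x,y)∼D}[h(x) ≠ y]. NumPy, the Gaussian marginal, and the particular η(x) below are illustrative assumptions only; the paper allows an arbitrary marginal over x.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, eta = 5, 10_000, 0.2          # dimension, sample size, Massart noise bound

# Ground-truth halfspace w* (never revealed to the learner).
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)

# The marginal on x is arbitrary in the paper; a Gaussian stands in here.
X = rng.standard_normal((n, d))
clean = np.sign(X @ w_star)

# Massart noise: each label flips independently with probability eta(x) <= eta.
flip_prob = eta * np.abs(np.sin(X[:, 0]))   # an arbitrary illustrative eta(x)
flips = rng.random(n) < flip_prob
y = np.where(flips, -clean, clean)

def misclassification_error(w, X, y):
    """Empirical estimate of Pr_{(x,y)~D}[sign(<w,x>) != y]."""
    return np.mean(np.sign(X @ w) != y)

# The paper's target guarantee is error at most eta + epsilon; even the
# ground-truth halfspace incurs roughly E[eta(x)] <= eta here.
print(misclassification_error(w_star, X, y))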
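As an illustration of the kind of procedure the Pseudocode row refers to, the sketch below runs projected SGD on a LeakyReLU surrogate of the zero-one loss. The surrogate, the leakage choice λ = η, the 1/√t step size, and the projection onto the unit ball are all assumptions made for this sketch, not a transcription of the paper's Algorithm 1 or 2 (which, in particular, handle the margin analysis and the general case).

```python
import numpy as np

def leaky_relu_loss(w, X, y, lam):
    """LeakyReLU_lam(-y <w,x>): a convex surrogate for the 0-1 loss."""
    z = -y * (X @ w)
    return np.where(z >= 0, (1 - lam) * z, lam * z)

def sgd_halfspace(X, y, lam, steps=20_000, seed=0):
    """Projected SGD on the LeakyReLU surrogate over the unit ball."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, steps + 1):
        i = rng.integers(n)
        z = -y[i] * (X[i] @ w)
        slope = (1 - lam) if z >= 0 else lam     # subgradient of LeakyReLU at z
        w -= (slope * (-y[i]) * X[i]) / np.sqrt(t)  # assumed 1/sqrt(t) step size
        norm = np.linalg.norm(w)
        if norm > 1:                             # project back onto the unit ball
            w /= norm
    return w

# Reusing X, y, eta, and misclassification_error from the previous sketch:
# w_hat = sgd_halfspace(X, y, lam=eta)
# print(misclassification_error(w_hat, X, y))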