reproducibilityindex.ai

Sample Complexity of Robust Linear Classification on Separated Data

Authors: Robi Bhattacharjee, Somesh Jha, Kamalika Chaudhuri

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	Most prior theoretical results for this problem have considered a setting where different classes in the data are close together or overlapping. We consider, in contrast, the well-separated case where there exists a classiﬁer with perfect accuracy and robustness, and show that the sample complexity narrates an entirely different story. Speciﬁcally, for linear classiﬁers, we show a large class of well-separated distributions where the expected robust loss of any algorithm is at least Ω( d n), whereas the max margin algorithm has expected standard loss O( 1 n). This shows a gap in the standard and robust losses that cannot be obtained via prior techniques. Additionally, we present an algorithm that, given an instance where the robustness radius is much smaller than the gap between the classes, gives a solution with expected robust loss is O( 1 n). This shows that for very well-separated data, convergence rates of O( 1 n) are achievable, which is not the case otherwise. Our results apply to robustness measured in any ℓp norm with p > 1 (including p = ).
Researcher Affiliation	Academia	1University of California, San Diego 2University of Wisconsin-Madison. Correspondence to: Robi Bhattacharjee <rcbhatta@eng.ucsd.edu>.
Pseudocode	Yes	Algorithm 1 Adversarial-Perceptron and Algorithm 2 Adversarial-Kernel-Perceptron are provided.
Open Source Code	No	The paper does not provide any specific links or statements about the availability of open-source code for the methodology described.
Open Datasets	No	The paper is theoretical and discusses abstract data distributions (D) rather than specific datasets for training. It does not mention any publicly available or open datasets.
Dataset Splits	No	The paper is theoretical and does not describe empirical experiments with training, validation, or test splits. Therefore, no information on dataset splits for reproduction is provided.
Hardware Specification	No	The paper is theoretical and does not describe specific hardware used for experiments. No GPU, CPU, or other hardware specifications are mentioned.
Software Dependencies	No	The paper is theoretical and does not specify software dependencies with version numbers needed for reproducibility.
Experiment Setup	No	The paper is theoretical and presents algorithms and analyses. It does not describe an empirical experimental setup with hyperparameters or system-level training settings.