A Characterization of Semi-Supervised Adversarially Robust PAC Learnability
Authors: Idan Attias, Steve Hanneke, Yishay Mansour
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the problem of learning a predictor that is adversarially robust to test-time attacks in the semi-supervised PAC model, and ask how many labeled and unlabeled examples are required to ensure learning. We show that, given enough unlabeled data (on the order of the labeled sample size a fully-supervised method would require), the labeled sample complexity can be arbitrarily smaller than in previous works and is sharply characterized by a different complexity measure. We prove nearly matching upper and lower bounds on this sample complexity. |
| Researcher Affiliation | Collaboration | Idan Attias, Department of Computer Science, Ben-Gurion University of the Negev (idanatti@post.bgu.ac.il); Steve Hanneke, Department of Computer Science, Purdue University (steve.hanneke@gmail.com); Yishay Mansour, Blavatnik School of Computer Science, Tel Aviv University and Google Research (mansour.yishay@gmail.com) |
| Pseudocode | Yes | Algorithm 1: Generic Adversarially-Robust Semi-Supervised (GRASS) learner. Input: labeled data set S_l ∼ D^{m_l}, unlabeled data set S_u ⊆ X, hypothesis class H, perturbation function U, parameters ε, δ. Algorithms used: a PAC learner A for partial concept classes and an agnostic adversarially robust supervised PAC learner B. (An illustrative sketch of this two-learner structure appears after the table.) |
| Open Source Code | No | The paper is theoretical and does not present empirical experiments, therefore no source code for the methodology is provided. The ethics statement explicitly marks 'Did you include the code, data, and instructions needed to reproduce the main experimental results' as N/A. |
| Open Datasets | No | The paper is theoretical and does not involve experiments with specific public datasets. It discusses theoretical concepts of samples and distributions (e.g., 'labeled and unlabeled examples drawn i.i.d. from unknown distribution D'), but does not provide access information for any public dataset used for training. |
| Dataset Splits | No | The paper is theoretical and does not conduct empirical experiments with dataset splits. It defines theoretical sample complexity in terms of 'labeled examples' and 'unlabeled examples' but does not specify empirical training/validation/test splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any hardware used for experiments. The ethics statement explicitly marks relevant questions (3d) as N/A. |
| Software Dependencies | No | The paper is theoretical and does not describe any software dependencies with specific version numbers. The ethics statement explicitly marks relevant questions (3a) as N/A regarding reproducibility of experimental results. |
| Experiment Setup | No | The paper is theoretical and focuses on mathematical proofs and algorithms. It does not provide specific experimental setup details, hyperparameters, or training configurations. The ethics statement marks questions regarding experimental details as N/A. |
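
To make the Input/Algorithms description in the pseudocode row more concrete, here is a minimal Python sketch of a two-stage semi-supervised wrapper of that shape: learn a predictor from the small labeled sample with a learner A, pseudo-label the large unlabeled sample, then pass the pseudo-labeled data to a second learner B. This is a hedged illustration, not the paper's exact Algorithm 1: the names `grass_learner` and `threshold_learner` are hypothetical, and the toy placeholder learner stands in for both the partial-concept PAC learner A and the agnostic adversarially robust supervised learner B, neither of which is implemented here.

```python
# Illustrative sketch (assumptions noted above), not the authors' implementation.
from typing import Callable, List, Sequence, Tuple

Example = Sequence[float]          # a feature vector x in X
Labeled = Tuple[Example, int]      # (x, y) with y in {0, 1}
Predictor = Callable[[Example], int]


def grass_learner(
    labeled: List[Labeled],                            # small labeled sample S_l
    unlabeled: List[Example],                          # large unlabeled sample S_u
    learner_A: Callable[[List[Labeled]], Predictor],   # stand-in for the partial-concept PAC learner A
    learner_B: Callable[[List[Labeled]], Predictor],   # stand-in for the agnostic robust learner B
) -> Predictor:
    """Two-stage semi-supervised wrapper: pseudo-label, then learn on the pseudo-labels."""
    # Stage 1: fit a predictor on the labeled data only.
    h = learner_A(labeled)
    # Stage 2: pseudo-label the unlabeled data with h ...
    pseudo_labeled = [(x, h(x)) for x in unlabeled]
    # ... and run the second learner on the pseudo-labeled sample.
    return learner_B(pseudo_labeled)


def threshold_learner(data: List[Labeled]) -> Predictor:
    """Toy placeholder: pick the 1-D threshold minimizing training error."""
    points = sorted(x[0] for x, _ in data)
    candidates = [points[0] - 1.0] + points
    best_t = min(
        candidates,
        key=lambda t: sum(int(x[0] > t) != y for x, y in data),
    )
    return lambda x, t=best_t: int(x[0] > t)


if __name__ == "__main__":
    S_l = [([0.1], 0), ([0.9], 1), ([0.2], 0), ([0.8], 1)]
    S_u = [[0.05], [0.3], [0.7], [0.95]]
    # The same toy learner stands in for both A and B here; in the paper they
    # are different objects with different guarantees.
    h = grass_learner(S_l, S_u, threshold_learner, threshold_learner)
    print([h(x) for x in S_u])
```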