Robustness Implies Generalization via Data-Dependent Generalization Bounds
Authors: Kenji Kawaguchi, Zhun Deng, Kyle Luh, Jiaoyang Huang
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments on real-world data and theoretical models demonstrate near-exponential improvements in various situations. To achieve these improvements, we do not require additional assumptions on the unknown distribution; instead, we only incorporate an observable and computable property of the training samples. A key technical innovation is an improved concentration bound for multinomial random variables that is of independent interest beyond robustness and generalization. |
| Researcher Affiliation | Academia | 1 National University of Singapore, 2 Harvard University, 3 University of Colorado Boulder, 4 New York University. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | For real-world data, we adopted the standard benchmark datasets: MNIST (LeCun et al., 1998), CIFAR-10 and CIFAR-100 (Krizhevsky & Hinton, 2009), SVHN (Netzer et al., 2011), Fashion-MNIST (FMNIST) (Xiao et al., 2017), Kuzushiji-MNIST (KMNIST) (Clanuwat et al., 2019), and Semeion (Srl & Brescia, 1994). |
| Dataset Splits | Yes | Following the literature on semi-supervised learning, we split the training data points into labeled data points (500 for Semeion and 5000 for all other datasets) and unlabeled data points (the remainder of the training data). |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | The data space is normalized such that X ⊆ [0, 1]^d for the dimensionality d of each input data. Accordingly, we used the infinity norm and a diameter of 0.1 for the ϵ-covering in all experiments. |
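
The Experiment Setup row describes an ϵ-covering of the normalized data space X ⊆ [0, 1]^d built from axis-aligned ℓ∞ cubes of diameter 0.1. The sketch below shows one way such a cover can be indexed; the function name `epsilon_cover_index` and the NumPy implementation are illustrative assumptions, not the authors' released code (the paper provides none).

```python
import numpy as np

def epsilon_cover_index(x, diameter=0.1):
    """Map a normalized input x in [0, 1]^d to the axis-aligned cube of the
    epsilon-cover that contains it, where each cube has side length
    `diameter` under the infinity norm."""
    x = np.clip(np.asarray(x, dtype=float), 0.0, 1.0)  # keep points inside [0, 1]^d
    n_cells = int(np.ceil(1.0 / diameter))             # 10 cells per axis for diameter 0.1
    # floor(x / diameter) picks the cube along each axis; clamp so that
    # x_i == 1.0 falls into the last cube instead of an out-of-range index.
    idx = np.minimum(np.floor(x / diameter).astype(int), n_cells - 1)
    return tuple(idx)

# Example: a 3-dimensional input and the cube of the cover it falls into.
print(epsilon_cover_index([0.05, 0.42, 1.0]))  # -> (0, 4, 9)
```

With diameter 0.1 there are 10 cells per axis, so the cover has at most 10^d cubes for d-dimensional inputs; the cell index is what groups training samples together in a robustness-style partition of the data space.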