Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation
Authors: Colin Wei, Tengyu Ma
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As a complement to the main theoretical results in this paper, we show empirically in Section 6 that directly regularizing our complexity measure can result in improved test performance. We provide preliminary experiments demonstrating that the proposed complexity measure and generalization bounds are empirically relevant. We show that regularizing the complexity measure leads to better test accuracy. |
| Researcher Affiliation | Academia | Colin Wei Computer Science Department Stanford University colinwei@stanford.edu Tengyu Ma Computer Science Department Stanford University tengyuma@stanford.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., repository link, explicit statement of code release) for the source code of the methodology described. |
| Open Datasets | Yes | Figure 1 shows the results for models trained and tested on CIFAR10 in low learning rate and no data augmentation settings, which are settings where generalization typically suffers. Table 1: Test error for a model trained on CIFAR10 in various settings. |
| Dataset Splits | No | The paper mentions training and testing on CIFAR10 but does not specify the exact train/validation/test splits, percentages, or methodology for splitting the dataset. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) used for running the experiments are provided in the paper. |
| Software Dependencies | No | The paper does not provide specific software dependencies (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | The threshold on the Frobenius norm in the regularization is inspired by the truncations in our augmented loss (in all our experiments, we choose σ = 0.1). We tune the coefficient λ as a hyperparameter. In our experiments, we took the regularized indices i to be last layers in each residual block as well as layers in residual blocks following a Batch Norm in the standard WideResNet16 architecture. In the Layer Norm setting, we simply replaced Batch Norm layers with Layer Norm. The remaining hyperparameter settings are standard for WideResNet; for additional details see Section I.1. (An illustrative sketch of such a thresholded penalty follows the table.) |
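
The quoted setup describes a soft-thresholded Frobenius-norm penalty with σ = 0.1 and a tuned coefficient λ, applied to a selected set of layers in a WideResNet16. The snippet below is a minimal, hypothetical sketch of such a penalty, not the authors' regularizer: the paper releases no code, its penalty is built on the data-dependent augmented-loss quantities rather than raw weights, and the PyTorch framework, the function name `frobenius_threshold_penalty`, and the use of layer weights as the penalized quantity are all assumptions made here for illustration.

```python
# Hypothetical sketch of a soft-thresholded Frobenius-norm penalty.
# NOT the paper's exact regularizer: layer weights stand in purely for
# illustration. sigma = 0.1 and a tuned lam follow the quoted setup.
import torch
import torch.nn as nn
import torch.nn.functional as F


def frobenius_threshold_penalty(layers, sigma=0.1):
    """Sum, over the selected layers, of the Frobenius norm's excess over sigma."""
    penalty = torch.tensor(0.0)
    for layer in layers:
        # Flatten the weight tensor and penalize only the part of its
        # Frobenius norm that exceeds the threshold sigma.
        penalty = penalty + F.relu(layer.weight.flatten().norm() - sigma)
    return penalty


# Toy usage: `reg_layers` would be the layers chosen for regularization
# (e.g. the last layer of each residual block); lam is tuned as a hyperparameter.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
reg_layers = [model[0], model[2]]   # hypothetical selection
lam = 1e-3                          # hypothetical value; tuned in practice
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
loss = F.cross_entropy(model(x), y) + lam * frobenius_threshold_penalty(reg_layers)
loss.backward()
```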