Abstraction Mechanisms Predict Generalization in Deep Neural Networks
Authors: Alex Gain, Hava Siegelmann
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The CNA (Cognitive Neural Activation metric) is highly predictive of generalization ability, outperforming norm- and sharpness-based generalization metrics on an extensive evaluation of close to 200 network instances comprising a breadth of dataset-architecture combinations, especially in cases where additive noise is present and/or training labels are corrupted. |
| Researcher Affiliation | Academia | Department of Computer Science, The Johns Hopkins University, Baltimore, MD 21218, USA; School of Computer and Information Sciences, University of Massachusetts Amherst, Amherst, MA 01003, USA. |
| Pseudocode | No | The paper provides a structured definition for the CNA in Figure 2, but it is presented as a definition overview rather than a formally labeled pseudocode or algorithm block. |
| Open Source Code | Yes | An implementation of this paper can be found on GitHub: https://github.com/alexgain/cna-icml2020 |
| Open Datasets | Yes | The datasets include ImageNet-32, CIFAR-10, CIFAR-100, MNIST, Fashion-MNIST, SVHN, corrupted-label counterparts (i.e., the same datasets with varying fractions of training labels shuffled; see the sketch after the table), and a random noise dataset. |
| Dataset Splits | No | The paper mentions 'validation datapoints' in the context of defining the margin metric, but it does not specify the concrete train/validation/test splits (percentages, counts, or use of explicit standard splits) that would be needed to reproduce its experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models, memory specifications, or cloud instance types. |
| Software Dependencies | No | The paper does not list any specific software dependencies or their version numbers (e.g., Python, PyTorch, TensorFlow versions) that would be needed for replication. |
| Experiment Setup | No | The paper mentions that 'All training details are included in the supplement,' implying that specific hyperparameters, optimizers, or other detailed setup configurations are not present in the main text. |
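
The corrupted-label datasets noted in the Open Datasets row follow the common construction of replacing a chosen fraction of training labels with random classes. Since the paper defers its training details to the supplement, the snippet below is only a minimal, generic sketch of that construction; the function name `corrupt_labels` and the use of NumPy are assumptions for illustration, not taken from the released code.

```python
import numpy as np

def corrupt_labels(labels, fraction, num_classes, seed=0):
    """Return a copy of `labels` with a given fraction replaced by random classes.

    This mirrors the usual shuffled/corrupted-labels setup; it is not
    necessarily the exact procedure used in the paper's experiments.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    n = len(labels)
    # Pick the subset of datapoints whose labels will be corrupted.
    idx = rng.choice(n, size=int(fraction * n), replace=False)
    # Assign uniformly random class labels to that subset.
    labels[idx] = rng.integers(0, num_classes, size=len(idx))
    return labels

# Example: corrupt 20% of a 10-class label set (e.g., CIFAR-10-sized).
clean = np.random.randint(0, 10, size=50000)
noisy = corrupt_labels(clean, fraction=0.2, num_classes=10, seed=42)
print((clean != noisy).mean())  # roughly 0.18, since some random labels match the originals
```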