Input Margins Can Predict Generalization Too
Authors: Coenraad Mouton, Marthinus Wilhelmus Theunissen, Marelie H Davel
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The predictive power of this new measure is demonstrated on the Predicting Generalization in Deep Learning (PGDL) dataset and contrasted with hidden representation margins. We find that constrained margins achieve highly competitive scores and outperform other margin measurements in general. |
| Researcher Affiliation | Collaboration | (1) Faculty of Engineering, North-West University, South Africa; (2) Centre for Artificial Intelligence Research, South Africa; (3) South African National Space Agency; (4) National Institute for Theoretical and Computational Sciences, South Africa |
| Pseudocode | Yes | Algorithm 1: DeepFool constrained margin calculation (an illustrative sketch follows the table) |
| Open Source Code | No | The paper mentions supplementary material for mathematical derivations ('The derivation of Equation (5) is included in the supplementary material.') but does not explicitly state that the source code for the described methodology is available, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | The Predicting Generalization in Deep Learning (PGDL) challenge exemplifies such an approach. The challenge was held at NeurIPS 2020 (Jiang et al. 2020). |
| Dataset Splits | No | The paper references tasks being split into prototyping/tuning and held-out sets (e.g., 'Tasks 1, 2, 4, and 5 were available for prototyping and tuning complexity measures, while Tasks 6 to 9 were used as a held-out set.'), but it does not provide specific details on train/validation/test splits for the datasets themselves (e.g., percentages or sample counts). |
| Hardware Specification | Yes | calculating the entire constrained margin distribution only takes 1 to 2 minutes per model on a single Nvidia A30. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments (e.g., programming languages, frameworks, or libraries). |
| Experiment Setup | Yes | Empirically, we find that the technique is not very sensitive with regard to the selection of hyperparameters and a single learning rate (γ = 0.25) and max iterations (max = 100) is used across all experiments. Furthermore, we use the same distance tolerance (δ = 0.01) for all tasks, except for Tasks 4 and 5, which require a smaller tolerance (δ = 0.001). |
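
The Pseudocode and Experiment Setup rows point to a DeepFool-based search for the nearest decision boundary in input space. Below is a minimal, illustrative sketch of such a search, not the authors' Algorithm 1: the function name `deepfool_margin`, the optional `project` hook (standing in for whatever constraint defines the "constrained" margin), and the exact stopping rule are assumptions made for illustration; only the hyperparameter values (γ = 0.25, max = 100 iterations, δ = 0.01) come from the quoted experiment setup.

```python
import torch

def deepfool_margin(model, x, gamma=0.25, max_iter=100, delta=0.01, project=None):
    """Sketch of a DeepFool-style input-margin estimate for one example x (shape [1, ...]).

    model    : classifier returning logits of shape [1, num_classes]
    gamma    : step-size scaling ("learning rate") for each boundary step
    max_iter : maximum number of iterations
    delta    : tolerance on the logit gap that counts as "on the boundary"
    project  : optional callable projecting each step onto a constraint set
               (hypothetical stand-in for the constrained-margin restriction)
    """
    x0 = x.detach()
    x_adv = x0.clone()
    k = model(x0).argmax(dim=1).item()  # originally predicted class

    for _ in range(max_iter):
        x_adv = x_adv.clone().requires_grad_(True)
        logits = model(x_adv)[0]

        # Find the competing class whose decision boundary is closest (linearized).
        best_ratio, best_step = None, None
        for j in range(logits.numel()):
            if j == k:
                continue
            gap = logits[j] - logits[k]
            grad = torch.autograd.grad(gap, x_adv, retain_graph=True)[0]
            w_norm = grad.norm() + 1e-12
            ratio = gap.abs() / w_norm
            if best_ratio is None or ratio < best_ratio:
                best_ratio = ratio
                # Minimal linearized step toward class j's boundary, scaled by gamma.
                best_step = gamma * (gap.abs() / w_norm ** 2) * grad

        if project is not None:
            best_step = project(best_step)  # restrict the step to the constraint set

        x_adv = (x_adv + best_step).detach()

        # Stop once the original class's advantage is within delta of the boundary.
        with torch.no_grad():
            new_logits = model(x_adv)[0]
        others = torch.cat([new_logits[:k], new_logits[k + 1:]])
        if new_logits[k] - others.max() <= delta:
            break

    # Margin estimate: distance moved from the original input.
    return (x_adv - x0).norm().item()
```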
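
As a usage note, the per-task tolerance described in the Experiment Setup row could be wired up as a small configuration helper. The names `PGDL_TASKS` and `margin_config` are hypothetical; only the values mirror the quoted setup (δ = 0.001 for Tasks 4 and 5, δ = 0.01 elsewhere, with a single γ and iteration cap throughout).

```python
# Hypothetical per-task configuration reflecting the quoted experiment setup.
# Task numbering follows the PGDL challenge (Tasks 1, 2, 4, 5 for tuning; 6-9 held out).
PGDL_TASKS = [1, 2, 4, 5, 6, 7, 8, 9]

def margin_config(task_id):
    return {
        "gamma": 0.25,
        "max_iter": 100,
        "delta": 0.001 if task_id in (4, 5) else 0.01,
    }

# Example usage with the sketch above:
#   margin = deepfool_margin(model, x, **margin_config(task_id))
configs = {t: margin_config(t) for t in PGDL_TASKS}
```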