Input Margins Can Predict Generalization Too

Authors: Coenraad Mouton, Marthinus Wilhelmus Theunissen, Marelie H Davel

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The predictive power of this new measure is demonstrated on the Predicting Generalization in Deep Learning (PGDL) dataset and contrasted with hidden representation margins. We find that constrained margins achieve highly competitive scores and outperform other margin measurements in general.
Researcher Affiliation | Collaboration | (1) Faculty of Engineering, North-West University, South Africa; (2) Centre for Artificial Intelligence Research, South Africa; (3) South African National Space Agency; (4) National Institute for Theoretical and Computational Sciences, South Africa
Pseudocode | Yes | Algorithm 1: DeepFool constrained margin calculation (an illustrative sketch of this procedure is given below the table).
Open Source Code | No | The paper mentions supplementary material for mathematical derivations ('The derivation of Equation (5) is included in the supplementary material.') but does not explicitly state that the source code for the described methodology is available, nor does it provide a direct link to a code repository.
Open Datasets | Yes | The Predicting Generalization in Deep Learning (PGDL) challenge exemplifies such an approach. The challenge was held at NeurIPS 2020 (Jiang et al. 2020).
Dataset Splits | No | The paper references tasks being split into prototyping/tuning and held-out sets (e.g., 'Tasks 1, 2, 4, and 5 were available for prototyping and tuning complexity measures, while Task 6 to 9 were used as a held-out set.'), but it does not provide specific details on train/validation/test splits for the datasets themselves (e.g., percentages or sample counts).
Hardware Specification | Yes | calculating the entire constrained margin distribution only takes 1 to 2 minutes per model on a single Nvidia A30.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments (e.g., programming languages, frameworks, or libraries).
Experiment Setup | Yes | Empirically, we find that the technique is not very sensitive with regard to the selection of hyperparameters and a single learning rate (γ = 0.25) and max iterations (max = 100) is used across all experiments. Furthermore, we use the same distance tolerance (δ = 0.01) for all tasks, except for Tasks 4 and 5, which require a smaller tolerance (δ = 0.001). (These settings are collected in the configuration example below.)
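
The constrained margin computation referenced in the Pseudocode row is, at heart, a DeepFool-style iterative search for the nearest point on the decision boundary, with each update restricted to a constrained subspace of the input space. The sketch below illustrates that idea under stated assumptions: a PyTorch classifier, a single input with a batch dimension of 1, a caller-supplied `project` function for the subspace constraint, a fixed runner-up class, and a termination test on the output score gap. It is a minimal sketch, not a reproduction of the authors' Algorithm 1.

```python
import torch

def constrained_margin(model, x, project, gamma=0.25, max_iter=100, delta=0.01):
    """Illustrative DeepFool-style estimate of a constrained input margin.

    Assumes `model` is a PyTorch classifier returning logits of shape
    (1, num_classes), `x` is a single input with a leading batch dimension,
    and `project` maps a perturbation onto the constrained subspace (e.g. the
    span of the leading principal components of the training data). The
    function name, fixed runner-up class, and termination rule are assumptions.
    """
    x = x.detach()
    with torch.no_grad():
        logits = model(x)
        # Fix the two competing classes from the initial prediction.
        top2 = logits.topk(2, dim=-1).indices[0]
        c_pred, c_other = top2[0].item(), top2[1].item()

    x_adv = x.clone()
    for _ in range(max_iter):
        x_adv.requires_grad_(True)
        logits = model(x_adv)
        # Score gap between the predicted class and its closest competitor;
        # the decision boundary lies where this gap vanishes.
        gap = logits[0, c_pred] - logits[0, c_other]
        if gap.abs().item() < delta:  # close enough to the boundary
            break
        grad = torch.autograd.grad(gap, x_adv)[0]
        # Linearised DeepFool step towards the gap = 0 hyperplane, damped by
        # the learning rate gamma and projected onto the constrained subspace.
        step = -gap.item() * grad / (grad.norm() ** 2 + 1e-12)
        x_adv = (x_adv.detach() + project(gamma * step)).detach()

    # The constrained margin is the distance moved from the original input.
    return (x_adv.detach() - x).norm().item()
```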
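
The Experiment Setup row quotes a single hyperparameter setting shared across tasks, apart from a tighter distance tolerance for Tasks 4 and 5. A hypothetical per-task configuration for the sketch above might look like this (task identifiers follow the PGDL numbering quoted in the Dataset Splits row; `model`, `x`, `project`, and `task` are placeholders):

```python
# Hyperparameter values quoted in the Experiment Setup row: one learning rate
# and iteration budget for all tasks, with a tighter distance tolerance for
# PGDL Tasks 4 and 5.
GAMMA = 0.25      # DeepFool-style step size (learning rate)
MAX_ITER = 100    # maximum iterations per sample
DELTA = {task: (0.001 if task in (4, 5) else 0.01)
         for task in (1, 2, 4, 5, 6, 7, 8, 9)}  # task identifiers quoted above

# Hypothetical usage with the constrained_margin sketch above:
# margin = constrained_margin(model, x, project,
#                             gamma=GAMMA, max_iter=MAX_ITER, delta=DELTA[task])
```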