Input Margins Can Predict Generalization Too

Authors: Coenraad Mouton, Marthinus Wilhelmus Theunissen, Marelie H Davel

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The predictive power of this new measure is demonstrated on the Predicting Generalization in Deep Learning (PGDL) dataset and contrasted with hidden representation margins. We find that constrained margins achieve highly competitive scores and outperform other margin measurements in general.
Researcher Affiliation | Collaboration | (1) Faculty of Engineering, North-West University, South Africa; (2) Centre for Artificial Intelligence Research, South Africa; (3) South African National Space Agency; (4) National Institute for Theoretical and Computational Sciences, South Africa
Pseudocode | Yes | Algorithm 1: DeepFool constrained margin calculation (an illustrative sketch of this procedure is given below the table).
Open Source Code | No | The paper mentions supplementary material for mathematical derivations ('The derivation of Equation (5) is included in the supplementary material.') but does not explicitly state that the source code for the described methodology is available, nor does it provide a direct link to a code repository.
Open Datasets | Yes | The Predicting Generalization in Deep Learning (PGDL) challenge exemplifies such an approach. The challenge was held at NeurIPS 2020 (Jiang et al. 2020).
Dataset Splits | No | The paper references tasks being split into prototyping/tuning and held-out sets (e.g., 'Tasks 1, 2, 4, and 5 were available for prototyping and tuning complexity measures, while Task 6 to 9 were used as a held-out set.'), but it does not provide specific details on train/validation/test splits for the datasets themselves (e.g., percentages or sample counts).
Hardware Specification | Yes | calculating the entire constrained margin distribution only takes 1 to 2 minutes per model on a single Nvidia A30.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments (e.g., programming languages, frameworks, or libraries).
Experiment Setup | Yes | Empirically, we find that the technique is not very sensitive with regard to the selection of hyperparameters and a single learning rate (γ = 0.25) and max iterations (max = 100) is used across all experiments. Furthermore, we use the same distance tolerance (δ = 0.01) for all tasks, except for Tasks 4 and 5, which require a smaller tolerance (δ = 0.001). (These settings are collected in the configuration example below.)
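
The constrained margin computation referenced in the Pseudocode row is, at heart, a DeepFool-style iterative search for the nearest point on the decision boundary, with each update restricted to a constrained subspace of the input space. The sketch below illustrates that idea under stated assumptions: a PyTorch classifier, a single input with a batch dimension of 1, a caller-supplied `project` function for the subspace constraint, a fixed runner-up class, and a termination test on the output score gap. It is a minimal sketch, not a reproduction of the authors' Algorithm 1.

```python
import torch

def constrained_margin(model, x, project, gamma=0.25, max_iter=100, delta=0.01):
    """Illustrative DeepFool-style estimate of a constrained input margin.

    Assumes `model` is a PyTorch classifier returning logits of shape
    (1, num_classes), `x` is a single input with a leading batch dimension,
    and `project` maps a perturbation onto the constrained subspace (e.g. the
    span of the leading principal components of the training data). The
    function name, fixed runner-up class, and termination rule are assumptions.
    """
    x = x.detach()
    with torch.no_grad():
        logits = model(x)
        # Fix the two competing classes from the initial prediction.
        top2 = logits.topk(2, dim=-1).indices[0]
        c_pred, c_other = top2[0].item(), top2[1].item()

    x_adv = x.clone()
    for _ in range(max_iter):
        x_adv.requires_grad_(True)
        logits = model(x_adv)
        # Score gap between the predicted class and its closest competitor;
        # the decision boundary lies where this gap vanishes.
        gap = logits[0, c_pred] - logits[0, c_other]
        if gap.abs().item() < delta:  # close enough to the boundary
            break
        grad = torch.autograd.grad(gap, x_adv)[0]
        # Linearised DeepFool step towards the gap = 0 hyperplane, damped by
        # the learning rate gamma and projected onto the constrained subspace.
        step = -gap.item() * grad / (grad.norm() ** 2 + 1e-12)
        x_adv = (x_adv.detach() + project(gamma * step)).detach()

    # The constrained margin is the distance moved from the original input.
    return (x_adv.detach() - x).norm().item()
```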
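
The Experiment Setup row quotes a single hyperparameter setting shared across tasks, apart from a tighter distance tolerance for Tasks 4 and 5. A hypothetical per-task configuration for the sketch above might look like this (task identifiers follow the PGDL numbering quoted in the Dataset Splits row; `model`, `x`, `project`, and `task` are placeholders):

```python
# Hyperparameter values quoted in the Experiment Setup row: one learning rate
# and iteration budget for all tasks, with a tighter distance tolerance for
# PGDL Tasks 4 and 5.
GAMMA = 0.25      # DeepFool-style step size (learning rate)
MAX_ITER = 100    # maximum iterations per sample
DELTA = {task: (0.001 if task in (4, 5) else 0.01)
         for task in (1, 2, 4, 5, 6, 7, 8, 9)}  # task identifiers quoted above

# Hypothetical usage with the constrained_margin sketch above:
# margin = constrained_margin(model, x, project,
#                             gamma=GAMMA, max_iter=MAX_ITER, delta=DELTA[task])
```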