The Memory-Perturbation Equation: Understanding Model's Sensitivity to Data

Authors: Peter Nickl, Lu Xu, Dharmesh Tailor, Thomas Möllenhoff, Mohammad Emtiyaz Khan

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our empirical results show that sensitivity estimates obtained during training can be used to faithfully predict generalization on unseen test data. |
| Researcher Affiliation | Academia | Peter Nickl (peter.nickl@riken.jp), Lu Xu (lu.xu.sw@riken.jp), Dharmesh Tailor (d.v.tailor@uva.nl), Thomas Möllenhoff (thomas.moellenhoff@riken.jp), Mohammad Emtiyaz Khan (emtiyaz.khan@riken.jp). RIKEN Center for AI Project, Tokyo, Japan; University of Amsterdam, Amsterdam, Netherlands. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All details of the experimental setup are included in App. I and the code is available at https://github.com/team-approx-bayes/memory-perturbation. |
| Open Datasets | Yes | We show results for three datasets, each using a different architecture but all trained using SGD. To estimate the Hessian $H_*$ and compute $v_i = \nabla f_i(\theta_*)^\top H_*^{-1} \nabla f_i(\theta_*)$, we use a Kronecker-factored (K-FAC) approximation implemented in the laplace [11] and ASDL [39] packages. |
| Dataset Splits | Yes | The approximation eliminates the need to train N models to perform CV; instead it uses only $e_{it}$ and $v_{it}$, which are extremely cheap to compute within algorithms such as ON, RMSprop, and SGD. Leave-group-out (LGO) estimates can also be built, for example by using Eq. 14, which enables us to understand the effect of leaving out a big chunk of training data, such as an entire class in classification. |
| Hardware Specification | No | The paper does not specify the exact hardware (e.g., GPU/CPU models, specific cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'laplace [11] and ASDL [39] packages' but does not specify their version numbers. |
| Experiment Setup | Yes | All details of the experimental setup are included in App. I and the code is available at https://github.com/team-approx-bayes/memory-perturbation. |
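
The quantity mentioned in the Open Datasets row, the quadratic form of each example's output gradient with the inverse Hessian, can be illustrated on a toy problem. The following is a minimal sketch and not the authors' code: it assumes an L2-regularised logistic regression with a linear logit f_i(θ) = x_iᵀθ, so the gradient of f_i is simply x_i, and it uses the exact Hessian rather than the K-FAC approximation from the laplace and ASDL packages. The function names, the regulariser `delta`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def hessian(X, theta, delta):
    """Exact Hessian of the L2-regularised logistic loss.

    Assumption: used here as a stand-in for the K-FAC approximation.
    """
    p = sigmoid(X @ theta)
    weights = p * (1.0 - p)
    return (X * weights[:, None]).T @ X + delta * np.eye(X.shape[1])


def fit_newton(X, y, delta=1.0, steps=25):
    """Fit the toy model with plain Newton steps (hypothetical helper)."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ theta) - y) + delta * theta
        theta = theta - np.linalg.solve(hessian(X, theta, delta), grad)
    return theta


def prediction_variances(X, theta, delta=1.0):
    """Return v_i = grad f_i(theta)^T H^{-1} grad f_i(theta) for every example.

    For the linear logit f_i(theta) = x_i^T theta, grad f_i(theta) = x_i.
    """
    H = hessian(X, theta, delta)
    H_inv_Xt = np.linalg.solve(H, X.T)         # shape (D, N), avoids forming H^{-1}
    return np.sum(X * H_inv_Xt.T, axis=1)      # row-wise x_i^T H^{-1} x_i


# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = (X @ true_theta + 0.5 * rng.normal(size=50) > 0).astype(float)

theta = fit_newton(X, y)
v = prediction_variances(X, theta)

# Indices of the three examples with the largest prediction variance.
most_uncertain = np.argsort(v)[-3:][::-1]
print(most_uncertain, v[most_uncertain])
```

Solving the linear system with `np.linalg.solve` instead of inverting the Hessian is a deliberate choice for numerical stability. For the deep networks discussed in the paper the exact Hessian would be too large, which is presumably why a Kronecker-factored approximation is used there.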