Measures of Information Reflect Memorization Patterns

Authors: Rachit Bansal, Danish Pruthi, Yonatan Belinkov

NeurIPS 2022 | Conference PDF | Archive PDF

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we hypothesize and subsequently show that the diversity in the activation patterns of different neurons is reflective of model generalization and memorization. We quantify the diversity in the neural activations through information-theoretic measures and find support for our hypothesis in experiments spanning several natural language and vision tasks.
Researcher Affiliation | Collaboration | Rachit Bansal (Delhi Technological University, racbansa@gmail.com); Danish Pruthi (Amazon Web Services, danish@hey.com); Yonatan Belinkov (Technion – Israel Institute of Technology, belinkov@technion.ac.il)
Pseudocode | Yes | Algorithm 1: Computation of information measures. Algorithmic procedures ENTROPY and MI are specified by algorithms 2 and 3 in appendix A. (A hedged sketch of these measures appears after this table.)
Open Source Code | Yes | The associated code and other resources for this work are available at https://information-measures.cs.technion.ac.il.
Open Datasets | Yes | We compare networks with varying levels of heuristic (§3) or example-level (§4) memorization across a variety of settings: synthetic setups based on the IMDb (Maas et al., 2011) and MNIST (LeCun et al., 1998) datasets for both memorization types, as well as naturally occurring scenarios of gender bias on Bias-in-Bios (De-Arteaga et al., 2019) and OOD image classification on NICO (Zhang et al., 2022).
Dataset Splits | Yes | The same correlations with an artifact do not hold in the validation sets. ... We then analyze these trained networks on the original validation set. ... We train ResNet-18 (He et al., 2015) networks for the two sets and evaluate them on the common NICO++ evaluation set, balanced across all common contexts.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, or memory) used to run its experiments. It only mentions general network types such as 'multi-layer perceptron (MLP)' and 'DistilBERT-base'.
Software Dependencies | No | The paper names the models and frameworks used (e.g., 'DistilBERT-base (Sanh et al., 2019)', 'RoBERTa-base (Liu et al., 2019)', 'ResNet-18 (He et al., 2015)'), but does not list software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | We synthetically introduce spurious artifacts in the training examples such that they co-occur with target labels. ... We consider a parameter α that controls the fraction of the training examples for which the spurious correlation holds true. ... The considered values of α and other details for this setup are given in appendix B.1. ... The full set of adjectives considered and further details are outlined in appendix B.2. ... More details are given in appendix B.3. (A hedged sketch of the artifact-injection step appears after this table.)
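
The paper's Algorithm 1 is not reproduced on this page, but the quoted description (an ENTROPY procedure and an MI procedure over neural activations) is enough to sketch the computation. The following is a minimal sketch, not the authors' implementation: it assumes activations are collected into a matrix of shape (num_examples, num_neurons) and discretized into equal-width histogram bins before estimating entropy and mutual information; the function names and the bin count are illustrative.

    import numpy as np

    def entropy(x, bins=10):
        """Shannon entropy (bits) of one neuron's activations,
        estimated from an equal-width histogram. Illustrative only."""
        counts, _ = np.histogram(x, bins=bins)
        p = counts / counts.sum()
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    def mutual_information(x, y, bins=10):
        """MI (bits) between two neurons' activations via a joint
        2-D histogram: I(X;Y) = H(X) + H(Y) - H(X,Y)."""
        joint, _, _ = np.histogram2d(x, y, bins=bins)
        p = joint / joint.sum()
        p = p[p > 0]
        h_joint = -np.sum(p * np.log2(p))
        return entropy(x, bins) + entropy(y, bins) - h_joint

    def information_measures(acts, bins=10):
        """Mean per-neuron entropy and mean pairwise MI for an
        activation matrix of shape (num_examples, num_neurons)."""
        n = acts.shape[1]
        ent = np.mean([entropy(acts[:, i], bins) for i in range(n)])
        mi = np.mean([mutual_information(acts[:, i], acts[:, j], bins)
                      for i in range(n) for j in range(i + 1, n)])
        return ent, mi

    # Example: diverse (independent) activations vs. redundant
    # (duplicated) ones, mimicking the generalization/memorization
    # contrast the paper studies.
    rng = np.random.default_rng(0)
    diverse = rng.normal(size=(1000, 8))
    redundant = np.repeat(diverse[:, :1], 8, axis=1)
    print(information_measures(diverse))    # near-zero pairwise MI
    print(information_measures(redundant))  # identical neurons: MI = H(X)

Under these assumptions, low pairwise MI corresponds to the activation diversity the paper associates with generalization, while highly redundant neurons drive MI up.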
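
The heuristic-memorization setup quoted in the Experiment Setup row (a spurious artifact made to co-occur with the target label in a fraction α of training examples) can likewise be sketched in a few lines. This is a hypothetical reconstruction, not the authors' code: the artifact token, the (text, label) data format, and the choice to corrupt only one label class are placeholders; the paper's actual α values and details are in its appendix B.1.

    import random

    def inject_artifact(examples, alpha, artifact="zxqv", target_label=1, seed=0):
        """Prepend a spurious token to a fraction `alpha` of the
        training examples whose label equals `target_label`, so the
        token co-occurs with that label. `examples` is a list of
        (text, label) pairs; all names here are illustrative."""
        rng = random.Random(seed)
        corrupted = []
        for text, label in examples:
            if label == target_label and rng.random() < alpha:
                text = f"{artifact} {text}"
            corrupted.append((text, label))
        return corrupted

    # alpha controls how strongly the artifact correlates with the
    # label; the paper sweeps several values (see its appendix B.1).
    train = [("a gripping film", 1), ("dull and slow", 0), ("loved it", 1)]
    print(inject_artifact(train, alpha=1.0))

A network that latches onto the injected token rather than the underlying task will fail on a validation set where, as the Dataset Splits row notes, the same correlation does not hold.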