Counterfactual Memorization in Neural Language Models

Authors: Chiyuan Zhang, Daphne Ippolito, Katherine Lee, Matthew Jagielski, Florian Tramèr, Nicholas Carlini

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We estimate and analyze counterfactual memorization of training examples in three standard text datasets: RealNews [Zellers et al., 2019], C4 [Raffel et al., 2020a] and Wiki40B:en [Guo et al., 2020]. Unless otherwise specified, we use Transformer-based language models [Vaswani et al., 2017] equivalent to (decoder-only) T5-base [Raffel et al., 2020b] with 112M parameters. (A hedged sketch of the counterfactual-memorization estimator appears after the table.)
Researcher Affiliation | Collaboration | Chiyuan Zhang (Google Research, chiyuan@google.com); Daphne Ippolito (Carnegie Mellon University, daphnei@cmu.edu); Katherine Lee (Google DeepMind, katherinelee@google.com); Matthew Jagielski (Google DeepMind, jagielski@google.com); Florian Tramèr (ETH Zürich, florian.tramer@inf.ethz.ch); Nicholas Carlini (Google DeepMind, ncarlini@google.com)
Pseudocode | No | The paper does not contain a figure, block, or section explicitly labeled "Pseudocode" or "Algorithm".
Open Source Code | No | The paper states: "Our experiments are implemented using JAX [Bradbury et al., 2018] and Flax [Heek et al., 2020], both open sourced library under the Apache-2.0 license." and "In the study of influence on generated texts, we use the publicly released generations from the Grover models [Zellers et al., 2019], available at their open source code repository, under the Apache-2.0 license." These statements refer to third-party libraries and released model outputs, not to a release of the authors' own implementation of the method.
Open Datasets | Yes | We estimate and analyze counterfactual memorization of training examples in three standard text datasets: RealNews [Zellers et al., 2019], C4 [Raffel et al., 2020a] and Wiki40B:en [Guo et al., 2020]. (A hedged dataset-loading sketch appears after the table.)
Dataset Splits | No | For C4/RealNews/Wiki40B:en, respectively, our models converge to an average per-token accuracy of 44.21%/47.59%/66.35% on the subsampled training set, and 27.90%/31.09%/49.55% on the validation set. The paper mentions a "validation set" but does not specify the exact split percentages or sample counts needed to reproduce the data partitioning.
Hardware Specification | No | The paper states: "We run the experiments using our internal cluster. The majority of the compute is consumed by model training. In this paper, we use standard training setup for transformer based neural language models, which could run on single node machines with one or multiple GPUs." This description is too general and lacks specific hardware details such as GPU models, CPU models, or memory specifications.
Software Dependencies | Yes | Our experiments are implemented using JAX [Bradbury et al., 2018] and Flax [Heek et al., 2020], both open sourced library under the Apache-2.0 license.
Experiment Setup | Yes | We train each model for 60 epochs using the Adam optimizer [Kingma and Ba, 2015] with learning rate 0.1 and weight decay 10^-5. (A hedged optimizer-configuration sketch appears after the table.)
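
The counterfactual memorization referenced in the Research Type row is defined in the paper as the expected performance of models that saw a given training example minus the expected performance of models that did not, estimated by training many models on random data subsets. The sketch below illustrates only that final IN-minus-OUT aggregation step, assuming the per-example performance scores and subset-membership masks have already been collected; the array names (`perf`, `in_train`) and the NumPy implementation are illustrative, not the authors' code.

```python
import numpy as np

def counterfactual_memorization(perf, in_train):
    """Estimate per-example counterfactual memorization.

    perf:     (num_models, num_examples) per-token accuracy of each trained
              model evaluated on each training example.
    in_train: (num_models, num_examples) boolean mask, True where the example
              was included in that model's training subset.

    Returns a (num_examples,) array: mean performance of models that trained
    on the example minus mean performance of models that did not.
    """
    perf = np.asarray(perf, dtype=np.float64)
    in_train = np.asarray(in_train, dtype=bool)

    # Guard against examples that never (or always) appear in a subset.
    in_counts = np.maximum(in_train.sum(axis=0), 1)
    out_counts = np.maximum((~in_train).sum(axis=0), 1)

    mean_in = (perf * in_train).sum(axis=0) / in_counts
    mean_out = (perf * ~in_train).sum(axis=0) / out_counts
    return mean_in - mean_out
```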
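
On the Open Datasets row: the paper names the corpora but does not give loader code. One possible way to access two of them is through TensorFlow Datasets, sketched below; the identifiers `c4/en` and `wiki40b/en` are TFDS names assumed here, not something specified by the authors, and depending on the TFDS version C4 may first need to be prepared with Apache Beam. RealNews [Zellers et al., 2019] is distributed with the Grover release and requires a separate download.

```python
import tensorflow_datasets as tfds

# English C4 and Wiki-40B via TensorFlow Datasets (dataset identifiers are
# assumptions; the paper does not state how the corpora were loaded).
c4_train = tfds.load("c4/en", split="train")
wiki40b_train = tfds.load("wiki40b/en", split="train")

# RealNews [Zellers et al., 2019] ships with the Grover release and must be
# obtained separately; TFDS's "c4/realnewslike" is only a C4-derived variant,
# not the original RealNews corpus.
```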
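
On the Experiment Setup row: the reported hyperparameters (60 epochs, Adam, learning rate 0.1, weight decay 10^-5) can be expressed as an Optax optimizer for a JAX/Flax training loop, matching the libraries the paper says it uses. The composition below is a sketch under the assumption that "Adam with weight decay" means decoupled weight decay (`optax.adamw`); the paper does not say how the decay is applied, and no training code is released.

```python
import optax

# Hyperparameters as reported in the paper.
LEARNING_RATE = 0.1
WEIGHT_DECAY = 1e-5   # 10^-5
NUM_EPOCHS = 60

# Assumption: decoupled (AdamW-style) weight decay; the paper only says
# "Adam ... with learning rate 0.1 and weight decay 10^-5".
optimizer = optax.adamw(learning_rate=LEARNING_RATE, weight_decay=WEIGHT_DECAY)

# Typical usage inside a Flax training step:
#   opt_state = optimizer.init(params)
#   updates, opt_state = optimizer.update(grads, opt_state, params)
#   params = optax.apply_updates(params, updates)
```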