Counterfactual Memorization in Neural Language Models
Authors: Chiyuan Zhang, Daphne Ippolito, Katherine Lee, Matthew Jagielski, Florian Tramèr, Nicholas Carlini
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We estimate and analyze counterfactual memorization of training examples in three standard text datasets: Real News [Zellers et al., 2019], C4 [Raffel et al., 2020a] and Wiki40B:en [Guo et al., 2020]. Unless otherwise specified, we use Transformer-based language models [Vaswani et al., 2017] equivalent to (decoder only) T5-base [Raffel et al., 2020b] with 112M parameters. (A hedged sketch of this estimator follows the table.) |
| Researcher Affiliation | Collaboration | Chiyuan Zhang (Google Research, chiyuan@google.com); Daphne Ippolito (Carnegie Mellon University, daphnei@cmu.edu); Katherine Lee (Google DeepMind, katherinelee@google.com); Matthew Jagielski (Google DeepMind, jagielski@google.com); Florian Tramèr (ETH Zürich, florian.tramer@inf.ethz.ch); Nicholas Carlini (Google DeepMind, ncarlini@google.com) |
| Pseudocode | No | The paper does not contain a figure, block, or section explicitly labeled "Pseudocode" or "Algorithm". |
| Open Source Code | No | The paper states: "Our experiments are implemented using JAX [Bradbury et al., 2018] and Flax [Heek et al., 2020], both open sourced library under the Apache-2.0 license." and "In the study of influence on generated texts, we use the publicly released generations from the Grover models [Zellers et al., 2019], available at their open source code repository, under the Apache-2.0 license." These refer to third-party libraries and models used, not the authors' own implementation code for their specific methodology. |
| Open Datasets | Yes | We estimate and analyze counterfactual memorization of training examples in three standard text datasets: Real News [Zellers et al., 2019], C4 [Raffel et al., 2020a] and Wiki40B:en [Guo et al., 2020]. |
| Dataset Splits | No | For C4/Real News/Wiki40B:en, respectively, our models converge to an average per-token accuracy of 44.21%/47.59%/66.35% on the subsampled training set, and 27.90%/31.09%/49.55% on the validation set. The paper mentions a "validation set" but does not specify the exact split percentages or sample counts to reproduce the data partitioning. |
| Hardware Specification | No | The paper states: "We run the experiments using our internal cluster. The majority of the compute is consumed by model training. In this paper, we use standard training setup for transformer based neural language models, which could run on single node machines with one or multiple GPUs." This description is too general and lacks specific hardware details such as GPU models, CPU models, or memory specifications. |
| Software Dependencies | Yes | Our experiments are implemented using JAX [Bradbury et al., 2018] and Flax [Heek et al., 2020], both open sourced library under the Apache-2.0 license. |
| Experiment Setup | Yes | We train each model for 60 epochs using the Adam optimizer [Kingma and Ba, 2015] with learning rate 0.1 and weight decay 10^-5. (A hedged sketch of this training configuration follows the table.) |
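
The methodology quoted in the Research Type row estimates counterfactual memorization by training many models on random subsets of the data and comparing each example's per-token accuracy between models that included it and models that held it out. The sketch below illustrates that estimator in JAX; it is not the authors' released code, and the array names, shapes, and the assumption that per-token accuracy matrices are already available are ours.

```python
import jax.numpy as jnp


def counterfactual_memorization(accuracy, in_train):
    """Estimate counterfactual memorization for each training example.

    accuracy: (m_models, n_examples) per-token accuracy of each model on each example.
    in_train: (m_models, n_examples) boolean mask; True if the example was in that
              model's training subset.
    Returns an (n_examples,) vector:
        mem(x) = mean accuracy of models trained on x
                 minus mean accuracy of models that never saw x.
    """
    mask = in_train.astype(jnp.float32)
    acc_in = (accuracy * mask).sum(axis=0) / jnp.clip(mask.sum(axis=0), 1.0)
    acc_out = (accuracy * (1.0 - mask)).sum(axis=0) / jnp.clip((1.0 - mask).sum(axis=0), 1.0)
    return acc_in - acc_out
```

Examples with high values are those whose predictions depend strongly on their own presence in the training set, which is the quantity the paper analyzes across Real News, C4, and Wiki40B:en.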
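
The Experiment Setup and Software Dependencies rows quote a JAX/Flax recipe (60 epochs, Adam, learning rate 0.1, weight decay 10^-5). The sketch below wires those hyperparameters into a single training step. It is an illustration under stated assumptions only: the tiny `ToyLM` module is a hypothetical stand-in for the decoder-only, T5-base-sized (112M parameter) model described in the paper, and the use of Optax's `adamw` to realize "Adam with weight decay" is our assumption.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn
import optax


class ToyLM(nn.Module):
    """Hypothetical stand-in for the decoder-only T5-base-sized language model."""
    vocab_size: int = 32000
    d_model: int = 64  # toy width; the real model is T5-base scale (112M parameters)

    @nn.compact
    def __call__(self, tokens):
        x = nn.Embed(self.vocab_size, self.d_model)(tokens)
        x = nn.Dense(self.d_model)(nn.relu(x))
        return nn.Dense(self.vocab_size)(x)  # per-token logits


model = ToyLM()
rng = jax.random.PRNGKey(0)
dummy_batch = jnp.zeros((8, 128), dtype=jnp.int32)
params = model.init(rng, dummy_batch)

# Quoted hyperparameters: learning rate 0.1, weight decay 1e-5.
optimizer = optax.adamw(learning_rate=0.1, weight_decay=1e-5)
opt_state = optimizer.init(params)


@jax.jit
def train_step(params, opt_state, tokens, targets):
    def loss_fn(p):
        logits = model.apply(p, tokens)
        return optax.softmax_cross_entropy_with_integer_labels(logits, targets).mean()

    loss, grads = jax.value_and_grad(loss_fn)(params)
    updates, opt_state = optimizer.update(grads, opt_state, params)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss


# One illustrative step on dummy data (real training runs for 60 epochs per model).
params, opt_state, loss = train_step(params, opt_state, dummy_batch, dummy_batch)
```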