Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Uncovering Latent Memories in Large Language Models

Authors: Sunny Duan, Mikail Khona, Abhiram Iyer, Rylan Schaeffer, Ila Fiete

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type: Experimental. We systematically evaluate how the statistical characteristics of training data, specifically sequence complexity and repetition, influence the likelihood of memorization in language models. Our findings demonstrate that the probability of memorizing a sequence scales logarithmically with its repetition in the training data as well as with the complexity of the sequence under consideration. These results extend prior work characterizing which sequences become memorized (Prashanth et al., 2024; Tirumala et al., 2022).
Researcher Affiliation: Academia. Sunny Duan (Brain and Cognitive Sciences, MIT), Mikail Khona (Physics, MIT), Abhiram Iyer (EECS, MIT), Rylan Schaeffer (Computer Science, Stanford University), Ila Rani Fiete (Brain and Cognitive Sciences, MIT).
Pseudocode: No. The paper describes its methodology in text and through diagrams (e.g., Figure 4a for the perturbation process) but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code: Yes. All code used for this project is available at https://github.com/sunnyddelight/latent_memorization.
Open Datasets: Yes. In this study, we largely focused on the Pythia 1B language model (Biderman et al., 2023a), which was trained on 300B tokens from the Pile (Gao et al., 2020). For selected experiments, to ensure our results hold on other language models, we reproduced our results using a second model, Amber-7B (Liu et al., 2023). We selected these two models as they were large, high-performing models complete with fully reproducible data sequences and frequent checkpoints.
Dataset Splits: No. The paper analyzes memorization in pre-trained language models (Pythia 1B and Amber-7B) using their original training corpora (the Pile and the Amber dataset). The experiments select specific sequences from these corpora and observe their memorization dynamics across model checkpoints, rather than defining new training, validation, and test splits.
Hardware Specification: Yes. All experiments were run on a cluster with access to 16 concurrent A100 GPUs. Each language model ran on a single GPU, and multiple GPUs were used to parallelize the experiments.
Software Dependencies: No. The paper mentions using the 'zlib package in Python' and refers to code from the Pythia project and the Amber model, but it does not provide specific version numbers for these or any other ancillary software dependencies used in their experiments.
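The paper's complexity measure is based on zlib compression. The exact formula is not reproduced in this report, but a common compression-ratio proxy can be sketched as follows; the function name, normalization, and example inputs are illustrative assumptions, not the paper's implementation:

```python
import zlib

def compressibility(data: bytes) -> float:
    """Estimate sequence complexity as the zlib compression ratio.

    Values near 0 indicate highly repetitive (low-complexity) input;
    values near or above 1 indicate nearly incompressible (high-complexity)
    input. The normalization here is an illustrative choice.
    """
    return len(zlib.compress(data)) / len(data)

# Repetitive text compresses far better than varied byte content.
low = compressibility(b"the cat sat on the mat. " * 50)
high = compressibility(bytes(range(256)) * 5)
```

Under a measure like this, the paper's restriction to "high complexity (> 0.8)" sequences would select text whose compressed size stays close to its raw size.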
Experiment Setup: Yes. Throughout this study, we set k = 32 and compare the continuation of the model with the original sequence by computing the Levenshtein distance between the next 64 tokens. [...] all experiments were run with the models in half precision (float16) and temperature 0. [...] We randomly perturb the model weights by adding a small amount of random Gaussian noise (of magnitude 2 × 10⁻³) to each of the weight parameters. We repeat this process 200 times [...] We tried four different temperatures and sampled 200 different sequence continuations from each of the temperatures. [...] In the rest of this work, we restrict our analysis to sequences which are presented once and have high complexity (> 0.8).
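The memorization criterion above (prompt with k = 32 tokens, then compare the model's next 64 tokens against the training sequence by Levenshtein distance) can be sketched as follows; the function names, the normalization, and the match threshold are illustrative assumptions, not the paper's exact implementation:

```python
def levenshtein(a, b):
    """Edit distance between two token sequences (classic DP, O(len(a)*len(b)))."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (x != y)))    # substitution
        prev = cur
    return prev[-1]

def is_memorized(generated, reference, max_norm_dist=0.1):
    """Flag a training sequence as memorized if the model's 64-token
    continuation is within a small normalized edit distance of the original.
    The 0.1 threshold is an illustrative choice, not the paper's."""
    gen, ref = generated[:64], reference[:64]
    return levenshtein(gen, ref) / max(len(ref), 1) <= max_norm_dist
```

In the paper's pipeline, `generated` would come from greedy (temperature 0) decoding of the perturbed or checkpointed model given the 32-token prefix, and `reference` from the corresponding training sequence.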