Quantifying and Analyzing Entity-Level Memorization in Large Language Models
Authors: Zhenhong Zhou, Jiuyang Xiang, Chaomeng Chen, Sen Su
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments based on the proposed, probing language models' ability to reconstruct sensitive entities under different settings. |
| Researcher Affiliation | Academia | Zhenhong Zhou¹, Jiuyang Xiang², Chaomeng Chen¹, Sen Su¹* (¹Beijing University of Posts and Telecommunications; ²University of Michigan) |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | The GPT-Neo model family (Black et al. 2021) includes a set of causal language models (CLMs) that are trained on The Pile datasets (Gao et al. 2020) and available in four sizes... The Enron email dataset (Klimt and Yang 2004), a subset of The Pile dataset, encompasses over 500,000 emails from approximately 150 users of the Enron Corporation. |
| Dataset Splits | No | The paper describes the datasets used (The Pile, Enron) and how entities were extracted and processed, but it does not specify explicit training, validation, or test dataset splits in terms of percentages, sample counts, or references to predefined splits used for reproducibility. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments (e.g., CPU, GPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper does not explicitly state its software stack; Python is implied by common ML research practice, but no version numbers are given for Python, PyTorch, TensorFlow, or any other libraries, frameworks, or solvers. |
| Experiment Setup | Yes | In our experimental setup, we use the greedy decoding strategy by default to generate the output with the minimum perplexity (PPL), which is then utilized for evaluating the model's entity memorization capabilities... The prefix length is a crucial parameter of soft prompts. (A minimal greedy-decoding sketch is given below the table.) |
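
The sketch below illustrates the decoding setting quoted in the "Experiment Setup" row: greedy decoding of a continuation from a prefix, followed by a perplexity score under the same model. It is a minimal sketch assuming the Hugging Face `transformers` API; the `EleutherAI/gpt-neo-125m` checkpoint and the prompt string are illustrative assumptions, not the authors' code or data.

```python
# Minimal sketch (not the authors' code): greedy decoding with a GPT-Neo
# checkpoint via Hugging Face transformers, mirroring the default setting
# described above (greedy decoding, perplexity-based evaluation).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "EleutherAI/gpt-neo-125m"  # smallest GPT-Neo size; the paper evaluates several sizes

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

prefix = "Please contact John Doe at"  # hypothetical prefix, not an entity from the paper

inputs = tokenizer(prefix, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=32,
        do_sample=False,  # greedy decoding: take the argmax token at every step
        num_beams=1,
    )

continuation = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(continuation)

# Perplexity of the generated sequence under the same model
with torch.no_grad():
    loss = model(output_ids, labels=output_ids).loss
print(f"PPL: {torch.exp(loss).item():.2f}")
```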