Watermarking Makes Language Models Radioactive

Authors: Tom Sander, Pierre Fernandez, Alain Durmus, Matthijs Douze, Teddy Furon

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We investigate the radioactivity of text generated by large language models (LLMs), i.e., whether it is possible to detect that such synthetic input was used to train a subsequent LLM. Our new methods, specialized for radioactivity, detect with provable confidence weak residuals of the watermark signal in the fine-tuned LLM. (A toy sketch of this kind of detection appears after the table.)
Researcher Affiliation | Collaboration | Tom Sander (Meta FAIR & École polytechnique), Pierre Fernandez (Meta FAIR & Inria Rennes), Alain Durmus (École polytechnique), Matthijs Douze (Meta FAIR), Teddy Furon (Inria Rennes)
Pseudocode | No | The paper describes methods and procedures in narrative text and figures but does not contain a formally structured or labeled pseudocode or algorithm block.
Open Source Code | Yes | Radioactivity detection code is available at https://github.com/facebookresearch/radioactive-watermark
Open Datasets | Yes | The first fine-tuning is done with the setup presented in Sec. 5, with ρ = 10% of watermarked data, and the second on OASST1 [Köpf et al., 2024].
Dataset Splits | No | The paper describes the datasets used for fine-tuning and the evaluation benchmarks, but it does not specify explicit training/validation/test splits for its own generated instruction/answer pairs or for the benchmarks, so the data partitioning for these experiments cannot be reproduced.
Hardware Specification | Yes | For our experiments, we utilized an internal cluster. ... on a single node equipped with 8 V100 GPUs. ... on a single V100 GPU.
Software Dependencies | No | The paper mentions optimizers and sampling methods, but it does not provide version numbers for software dependencies (e.g., Python, PyTorch, TensorFlow, or scikit-learn) required to replicate the experiments.
Experiment Setup | Yes | We use AdamW [Loshchilov and Hutter, 2017a] for 3000 steps, with a batch size of 8, a learning rate of 10⁻⁵, and a context size of 2048 tokens (which results in 3 training epochs). The learning rate follows a cosine annealing schedule [Loshchilov and Hutter, 2017b] with 100 warmup steps. ... logit bias δ = 3.0, proportion of greenlist tokens γ = 0.25, and k = 2. In both cases, we use nucleus sampling [Holtzman et al., 2019] with p = 0.95 and T = 0.8. (Configuration sketches follow the table.)
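
The Research Type row quotes the paper's claim that watermark residuals can be detected with provable confidence. As a toy illustration of the green-list scoring that this family of watermarks relies on (in the spirit of the Kirchenbauer et al. scheme the paper builds on), here is a minimal Python sketch. The hash-based green-list assignment and the function names are illustrative assumptions, not the authors' implementation, which lives in the linked repository:

```python
# Minimal sketch of green-list watermark scoring, using the paper's
# reported parameters gamma = 0.25 and k = 2. The toy hash-based green-list
# assignment below is an assumption for illustration only.
import hashlib

from scipy.stats import norm

GAMMA = 0.25  # proportion of greenlist tokens (paper's setting)
K = 2         # watermark window: the seed depends on the previous k tokens


def is_green(window: tuple, token: int) -> bool:
    """Pseudo-randomly assign `token` to the green list, seeded by `window`."""
    digest = hashlib.sha256(str(window + (token,)).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GAMMA


def watermark_pvalue(tokens: list) -> float:
    """One-sided z-test: under H0 (no watermark) each scored token is green
    with probability GAMMA, so an excess of green tokens is evidence of the
    watermark and, after fine-tuning on watermarked text, of its
    'radioactive' residue in the suspect model's outputs."""
    scored = [is_green(tuple(tokens[i - K:i]), tokens[i])
              for i in range(K, len(tokens))]
    n = len(scored)
    greens = sum(scored)
    z = (greens - GAMMA * n) / (GAMMA * (1 - GAMMA) * n) ** 0.5
    return float(norm.sf(z))  # small p-value => watermark signal detected
```

A very small p-value computed over text sampled from the suspect model is the kind of provable confidence the abstract refers to; the threshold and the exact scoring the authors use are specified in the paper and repository, not here.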
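
For the Experiment Setup row, the quoted hyperparameters translate almost directly into a training configuration. The sketch below, assuming plain PyTorch and a placeholder model, shows one way to realize the stated AdamW optimizer and cosine schedule with 100 warmup steps; it illustrates the reported settings and is not the authors' training script:

```python
# Hedged sketch of the reported fine-tuning configuration in plain PyTorch.
# Only the hyperparameter values are from the paper; everything else is a
# placeholder.
import math

import torch

TOTAL_STEPS = 3000   # reported number of optimization steps
WARMUP_STEPS = 100   # reported warmup length
BASE_LR = 1e-5       # reported learning rate
BATCH_SIZE = 8       # sequences per step (reported)
CONTEXT_SIZE = 2048  # tokens per sequence (reported)

model = torch.nn.Linear(CONTEXT_SIZE, CONTEXT_SIZE)  # stand-in for the LLM
optimizer = torch.optim.AdamW(model.parameters(), lr=BASE_LR)


def lr_lambda(step: int) -> float:
    """Linear warmup, then cosine annealing to zero (one common reading of
    the schedule described in the paper)."""
    if step < WARMUP_STEPS:
        return step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * (1.0 + math.cos(math.pi * progress))


scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

# Training loop skeleton: step the scheduler once per optimizer step.
# for step in range(TOTAL_STEPS):
#     loss = ...  # forward pass on a BATCH_SIZE x CONTEXT_SIZE token batch
#     loss.backward(); optimizer.step(); optimizer.zero_grad(); scheduler.step()
```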
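
The same row also quotes the generation-side parameters (logit bias δ = 3.0, nucleus sampling with p = 0.95, T = 0.8). A self-contained sketch of how such watermarked sampling typically combines them, again an illustration rather than the authors' code:

```python
# Hedged sketch of watermarked nucleus sampling with the quoted parameters.
# The green-list assignment repeats the toy hash from the detection sketch.
import hashlib

import torch

DELTA, GAMMA, K = 3.0, 0.25, 2     # reported watermark parameters
TOP_P, TEMPERATURE = 0.95, 0.8     # reported sampling parameters


def is_green(window: tuple, token: int) -> bool:
    """Toy green-list assignment seeded by the previous K tokens."""
    digest = hashlib.sha256(str(window + (token,)).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GAMMA


def sample_next(logits: torch.Tensor, window: tuple) -> int:
    """Bias greenlist logits by DELTA, then nucleus-sample at TOP_P."""
    logits = logits.clone()
    for tok in range(logits.numel()):  # toy loop; real code vectorizes this
        if is_green(window, tok):
            logits[tok] += DELTA
    probs = torch.softmax(logits / TEMPERATURE, dim=-1)
    sorted_p, sorted_idx = probs.sort(descending=True)
    keep = sorted_p.cumsum(-1) - sorted_p < TOP_P  # smallest set of mass >= p
    sorted_p[~keep] = 0.0
    choice = torch.multinomial(sorted_p / sorted_p.sum(), 1).item()
    return int(sorted_idx[choice])
```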