On the Learnability of Watermarks for Language Models

Authors: Chenchen Gu, Xiang Lisa Li, Percy Liang, Tatsunori Hashimoto

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test our approach on three decoding-based watermarking strategies and various hyperparameter settings, finding that models can learn to generate watermarked text with high detectability.
Researcher Affiliation | Academia | Chenchen Gu, Xiang Lisa Li, Percy Liang, Tatsunori Hashimoto (Stanford University); {cygu, xlisali, thashim}@stanford.edu, pliang@cs.stanford.edu
Pseudocode | No | The paper describes its methods (logit-based and sampling-based watermark distillation) in prose and with mathematical equations, but it does not include any structured pseudocode or algorithm blocks. (A hedged sketch of the logit-based objective appears below the table.)
Open Source Code | Yes | In addition, we release code and scripts to reproduce experiments at https://github.com/chenchenygu/watermark-learnability, along with trained model weights.
Open Datasets | Yes | For logit-based watermark distillation, we use Llama 2 7B (Touvron et al., 2023) as both the teacher and student models... We distill using a subset of OpenWebText (Gokaslan et al., 2019) for 5,000 steps... We evaluate on generations prompted by prefixes from the RealNewsLike subset of the C4 dataset (Raffel et al., 2020).
Dataset Splits | Yes | We distill using a subset of OpenWebText (Gokaslan et al., 2019) for 5,000 steps with a batch size of 64 sequences, sequence length of 512 tokens, maximal learning rate of 1e-5, and cosine learning rate decay with a linear warmup. ... We evaluate on generations prompted by prefixes from the RealNewsLike subset of the C4 dataset (Raffel et al., 2020). For each decoding-based watermarking strategy and distilled model, we generate 5,000 200-token completions from 50-token prompts from the validation split.
Hardware Specification | Yes | Each training run took approximately 6 hours on 4 NVIDIA A100 80GB GPUs. (Appendix E.1) ... Each training run took approximately 3 hours on 1 NVIDIA A100 80GB GPU. (Appendix F)
Software Dependencies | No | The paper mentions models (e.g., 'Llama 2 7B', 'Pythia 1.4B') and optimizers ('AdamW optimizer') but does not specify version numbers for key software libraries or dependencies (e.g., 'PyTorch 1.x', 'Hugging Face Transformers 4.x').
Experiment Setup | Yes | We distill using a subset of OpenWebText (Gokaslan et al., 2019) for 5,000 steps with a batch size of 64 sequences, sequence length of 512 tokens, maximal learning rate of 1e-5, and cosine learning rate decay with a linear warmup. (Illustrative training and evaluation sketches follow the table.)
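
As the Pseudocode row notes, the distillation objectives are given only in prose and equations. The following is a minimal PyTorch sketch of a logit-based watermark distillation loss under a KGW-style green-list watermark; the helper names (`green_list_mask`, `logit_distillation_loss`), the seeding scheme, and the default `gamma`/`delta` values are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def green_list_mask(prev_tokens: torch.Tensor, vocab_size: int,
                    gamma: float = 0.25, seed: int = 42) -> torch.Tensor:
    """Mark a pseudorandom gamma-fraction of the vocabulary as 'green',
    seeded by each sequence's previous token (KGW-style; seeding is assumed)."""
    masks = []
    for tok in prev_tokens.tolist():
        gen = torch.Generator().manual_seed(seed * (tok + 1))
        perm = torch.randperm(vocab_size, generator=gen)
        mask = torch.zeros(vocab_size)
        mask[perm[: int(gamma * vocab_size)]] = 1.0
        masks.append(mask)
    return torch.stack(masks)  # shape: (batch, vocab_size)

def logit_distillation_loss(student_logits: torch.Tensor,
                            teacher_logits: torch.Tensor,
                            prev_tokens: torch.Tensor,
                            delta: float = 2.0) -> torch.Tensor:
    """KL divergence from the teacher's watermarked next-token distribution
    to the student's distribution, for one position per sequence."""
    mask = green_list_mask(prev_tokens, teacher_logits.size(-1)).to(teacher_logits)
    watermarked = teacher_logits + delta * mask          # bias green-list tokens
    target = F.softmax(watermarked, dim=-1)              # watermarked teacher distribution
    log_student = F.log_softmax(student_logits, dim=-1)
    return F.kl_div(log_student, target, reduction="batchmean")
```

Summing a loss of this shape over the positions of a batch of OpenWebText sequences and backpropagating through the student is the general form of the logit-based recipe; per the paper, the sampling-based variant instead fine-tunes the student on watermarked text sampled from the teacher using an ordinary language-modeling loss.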
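The Dataset Splits and Experiment Setup rows quote only aggregate training hyperparameters. As a rough illustration, they map onto Hugging Face `TrainingArguments` as follows; the per-device/gradient-accumulation split, warmup length, and precision are assumptions not stated in the quoted text.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama2-7b-watermark-distill",  # assumed name
    max_steps=5_000,                  # 5,000 distillation steps
    per_device_train_batch_size=4,    # assumed split: 4 GPUs x 4 sequences x 4 accumulation
    gradient_accumulation_steps=4,    #   = 64 sequences per optimizer step
    learning_rate=1e-5,               # maximal learning rate
    lr_scheduler_type="cosine",       # cosine decay ...
    warmup_steps=500,                 # ... with a linear warmup (length assumed)
    bf16=True,                        # precision not stated in the quote; assumed
)
```

The 512-token sequence length would be enforced when tokenizing the OpenWebText subset rather than in `TrainingArguments`.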
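The evaluation protocol quoted under Dataset Splits (5,000 completions of 200 tokens from 50-token prompts drawn from the C4 RealNewsLike validation split) could be approximated as below; the Hub identifiers, streaming access, and sampling settings are assumptions, and the watermark generation and detection logic from the authors' repository is omitted.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"   # assumed checkpoint id (or a distilled student)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

c4 = load_dataset("allenai/c4", "realnewslike", split="validation", streaming=True)

completions = []
for i, example in enumerate(c4):
    if i >= 5000:                          # 5,000 completions per strategy/model
        break
    prompt_ids = tokenizer(example["text"], return_tensors="pt",
                           truncation=True, max_length=50).input_ids.to(model.device)
    out = model.generate(prompt_ids, max_new_tokens=200, do_sample=True)
    completions.append(tokenizer.decode(out[0, prompt_ids.shape[1]:],
                                        skip_special_tokens=True))
```

Each completion would then be scored with the corresponding watermark detector to measure detectability.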