An Unforgeable Publicly Verifiable Watermark for Large Language Models
Authors: Aiwei Liu, Leyi Pan, Xuming Hu, Shuang Li, Lijie Wen, Irwin King, Philip S. Yu
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that our algorithm attains high detection accuracy and computational efficiency through neural networks. |
| Researcher Affiliation | Academia | 1Tsinghua University 2The Chinese University of Hong Kong 3The Hong Kong University of Science and Technology (Guangzhou) 4University of Illinois at Chicago |
| Pseudocode | Yes | Algorithm 1 Watermark Generation Step |
| Open Source Code | Yes | Our code is available at https://github.com/THU-BPM/unforgeable_watermark. |
| Open Datasets | Yes | We select the C4 (Raffel et al., 2020) and Dbpedia Class (Raffel et al., 2020) datasets |
| Dataset Splits | No | The paper describes the data used for training the watermark generation and detection networks and for evaluating the main results (500 human, 500 generated texts), but it does not specify explicit validation splits or quantities for the overall experimental setup. |
| Hardware Specification | Yes | All timings were recorded on a single V100 32G GPU. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' but does not specify version numbers for any software dependencies like programming languages (e.g., Python), libraries (e.g., PyTorch, TensorFlow), or other tools. |
| Experiment Setup | Yes | The default hyperparameters are configured as follows: watermark token ratio γ of 0.5, window size w of 5, five token embedding layers, and δ = 2 for the generator. The detector comprises two LSTM layers as well as the same token embedding layers. For decoding, Top-K employs K = 20 and Beam Search utilizes a beam width of 8. Both networks adopt a learning rate of 0.01, optimized via the Adam optimizer (Kingma & Ba, 2014). Due to the varying sensitivity of language models to the watermark, for ease of comparison, the detection z-score thresholds for GPT-2, OPT 1.3B, and LLaMA 7B are set to 1, 1, and 3, respectively. |
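The "Algorithm 1 Watermark Generation Step" referenced in the Pseudocode row follows the paper's two-network design: a stack of shared token-embedding layers feeds a small generation network that labels each candidate token as watermarked (green) or not, based on a window of the w most recent tokens, and δ is added to the logits of green candidates. The sketch below is a minimal reconstruction of that step, not the authors' released code: the names `WatermarkGenerator` and `watermarked_logits`, the layer widths, and the restriction to top-k candidates are assumptions made here for illustration.

```python
import torch
import torch.nn as nn


class WatermarkGenerator(nn.Module):
    """Sketch of the watermark generation network: shared token-embedding
    layers followed by a classifier that labels a candidate token as
    watermarked (green) given a window of the w most recent tokens.
    Layer sizes are illustrative, not taken from the paper."""

    def __init__(self, vocab_size, embed_dim=64, window_size=5, num_embed_layers=5):
        super().__init__()
        self.window_size = window_size
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Stack of shared token-embedding layers (five by default, per the paper).
        layers = []
        for _ in range(num_embed_layers - 1):
            layers += [nn.Linear(embed_dim, embed_dim), nn.ReLU()]
        self.shared = nn.Sequential(*layers)
        self.classifier = nn.Linear(window_size * embed_dim, 1)

    def forward(self, window_ids):
        # window_ids: (batch, window_size) = last w-1 context tokens + 1 candidate.
        h = self.shared(self.embed(window_ids))               # (batch, w, d)
        return torch.sigmoid(self.classifier(h.flatten(1)))   # P(green), (batch, 1)


def watermarked_logits(logits, prev_ids, generator, delta=2.0, top_k=20):
    """One watermark generation step (hypothetical helper): add delta to the
    logits of the top-k candidates that the generator labels as green.
    Assumes prev_ids already holds at least w-1 tokens."""
    topk = torch.topk(logits, top_k).indices                  # (top_k,)
    w = generator.window_size
    ctx = prev_ids[-(w - 1):]                                 # last w-1 tokens
    # Build one window per candidate: context followed by the candidate token.
    windows = torch.stack([torch.cat([ctx, c.view(1)]) for c in topk])
    green = generator(windows).squeeze(-1) > 0.5              # boolean mask, (top_k,)
    out = logits.clone()
    out[topk[green]] += delta                                 # boost green candidates
    return out
```

Because only the detection network (not the generator's key) needs to be published, this split is what makes the watermark publicly verifiable yet unforgeable; the sketch above covers only the generation side.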
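For convenience, the defaults quoted in the Experiment Setup row can be collected into a single configuration object. The dataclass below is purely illustrative; the paper defines no such object, and the field names are chosen here.

```python
from dataclasses import dataclass


@dataclass
class UPVConfig:
    """Default hyperparameters as reported in the paper's experiment setup.
    Field names are assumptions made for this sketch."""
    gamma: float = 0.5            # watermark (green) token ratio
    window_size: int = 5          # w, tokens seen by the generation network
    num_embed_layers: int = 5     # shared token-embedding layers
    delta: float = 2.0            # logit boost for watermarked tokens
    num_lstm_layers: int = 2      # detector: LSTM layers over shared embeddings
    top_k: int = 20               # Top-K decoding
    beam_width: int = 8           # Beam Search decoding
    learning_rate: float = 0.01   # Adam, for both networks
    # Detection z-score thresholds vary per model:
    # GPT-2: 1, OPT 1.3B: 1, LLaMA 7B: 3.
```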