Provable Robust Watermarking for AI-Generated Text

Authors: Xuandong Zhao, Prabhanjan Vijendra Ananth, Lei Li, Yu-Xiang Wang

ICLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments on three varying LLMs and two datasets verify that our UNIGRAM-WATERMARK achieves superior detection accuracy and comparable generation quality in perplexity, thus promoting the responsible use of LLMs. Code is available at https://github.com/XuandongZhao/Unigram-Watermark. In this section, we aim to conduct experiments to evaluate watermark detection performance, watermarked text quality, and robustness against attacks compared to the baseline. Additional experiment results including different parameters, white-box attacks, scaled language models, etc. are deferred to Appendix B.
Researcher Affiliation Academia Xuandong Zhao Prabhanjan Ananth Lei Li Yu-Xiang Wang UC Santa Barbara {xuandongzhao,prabhanjan,leili,yuxiangw}@cs.ucsb.edu
Pseudocode Yes Pseudocode for our approach, Watermark and Detect, is provided in Algorithms 1 and 2.
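The Watermark/Detect pair referenced above can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the helper names (`green_list`, `watermark_logits`, `detect`) are hypothetical, and it assumes the scheme's core idea of one fixed key-seeded green list shared across all positions, with detection via a z-test on the green-token count.

```python
import math
import random


def green_list(key: int, vocab_size: int, gamma: float) -> set:
    # One fixed green list for all positions, derived from the secret key.
    rng = random.Random(key)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])


def watermark_logits(logits: list, green: set, delta: float) -> list:
    # Watermark: add bias delta to every green-list token's logit
    # before sampling the next token.
    return [l + delta if i in green else l for i, l in enumerate(logits)]


def detect(token_ids: list, green: set, gamma: float) -> float:
    # Detect: z-score of the observed green-token count against the
    # null hypothesis that each token is green with probability gamma.
    n = len(token_ids)
    g = sum(1 for t in token_ids if t in green)
    return (g - gamma * n) / math.sqrt(n * gamma * (1 - gamma))
```

A text whose tokens fall in the green list far more often than the expected fraction γ yields a large z-score and is flagged as watermarked.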
Open Source Code Yes Code is available at https://github.com/XuandongZhao/Unigram-Watermark.
Open Datasets Yes We utilize two long-form text datasets: OpenGen and LFQA. OpenGen, collected by Krishna et al. (2023), consists of 3K two-sentence chunks sampled from the validation split of WikiText-103 (Merity et al., 2017). LFQA is a long-form question-answering dataset created by Krishna et al. (2023) by scraping questions from Reddit, posted between July and December 2021, across six domains.
Dataset Splits No The paper does not explicitly provide training/test/validation dataset splits. It mentions using the "validation split of Wiki Text-103" for collecting prompts but does not specify how its own generated data was split for training, validation, or testing of their watermark detection model itself.
Hardware Specification Yes The experiments are conducted on Nvidia A100 GPUs.
Software Dependencies No The paper mentions using "Huggingface library (Wolf et al., 2019)" but does not specify a version number for this or any other software dependency necessary for reproduction. It also mentions GPT3 (text-davinci-003) for perplexity evaluation but without versioning.
Experiment Setup Yes We use a watermark strength of δ = 2.0 and a green list ratio of γ = 0.5. We also use different watermark keys k for different models. Nucleus Sampling (Holtzman et al., 2020) is employed as the default decoding algorithm to introduce randomness while maintaining human-like text output.
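A toy sketch of how the reported settings (δ = 2.0, γ = 0.5, nucleus sampling) compose: the green-list bias is added to the logits first, and nucleus (top-p) sampling then draws from the biased distribution. The 4-token vocabulary, the green set `{0, 2}`, and the helper name `nucleus_sample` are illustrative assumptions, not taken from the paper.

```python
import math
import random


def nucleus_sample(logits: list, p: float = 0.95, seed: int = 42) -> int:
    # Nucleus (top-p) sampling: keep the smallest set of highest-probability
    # tokens whose cumulative mass reaches p, then sample from that set.
    rng = random.Random(seed)
    m = max(logits)
    probs = [math.exp(l - m) for l in logits]
    z = sum(probs)
    probs = [q / z for q in probs]
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    r = rng.random() * sum(probs[i] for i in kept)
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]


# Apply the watermark bias (delta = 2.0, gamma = 0.5) before sampling.
delta = 2.0
green = {0, 2}  # hypothetical green list over a 4-token toy vocabulary
logits = [1.0, 1.0, 1.0, 1.0]
biased = [l + delta if i in green else l for i, l in enumerate(logits)]
next_token = nucleus_sample(biased, p=0.95)
```

Because the bias is applied to logits rather than to a specific decoding rule, the same watermarking step works unchanged with greedy decoding, beam search, or any sampling scheme.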