Segmenting Watermarked Texts From Language Models
Authors: Xingchi Li, Guanxun Li, Xianyang Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate our technique, we apply it to texts generated by several language models with prompts extracted from Google's C4 dataset and obtain encouraging numerical results. |
| Researcher Affiliation | Academia | Xingchi Li Department of Statistics Texas A&M University College Station, TX 77843 anthony.li@stat.tamu.edu Guanxun Li Department of Statistics Beijing Normal University at Zhuhai Zhuhai, Guangdong 519087 guanxun@bnu.edu.cn Xianyang Zhang Department of Statistics Texas A&M University College Station, TX 77843 zhangxiany@stat.tamu.edu |
| Pseudocode | Yes | Algorithm 1 SeedBS-NOT for change point detection in potentially partially watermarked texts |
| Open Source Code | Yes | We release all code publicly at https://github.com/doccstat/llm-watermark-cpd. |
| Open Datasets | Yes | We conduct extensive real-data-based experiments following a similar empirical setting in Kirchenbauer et al. [2023a], where we generate watermarked text based on the prompts sampled from the news-like subset of the colossal clean crawled corpus (C4) dataset [Raffel et al., 2020]. |
| Dataset Splits | No | The paper does not provide specific details on validation dataset splits, percentages, or methodology. While it mentions validating their *technique* generally, it does not specify a distinct 'validation set' split from the C4 dataset or other experimental data. |
| Hardware Specification | No | The paper mentions "Arseven Computing Cluster at the Department of Statistics, Texas A&M University" but does not specify any particular CPU, GPU models, or detailed hardware specifications. |
| Software Dependencies | No | The paper mentions using specific LLM models (e.g., openai-community/gpt2, facebook/opt-1.3b, Meta-Llama-3-8B) and GNU Parallel, but it does not provide specific version numbers for these or any other software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | We fix the length of text m = 500, the size of sliding window B = 20, and the block size used in the block bootstrap-based test B = 20. ... In Algorithm 1, we set the decay parameter a = 2 and the minimum length of the intervals generated by SeedBS to be 50 such that the block bootstrap-based test is meaningful, and the threshold ζ ∈ {0.05, 0.01, 0.005, 0.001}. |
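
To make the Experiment Setup values concrete, the sketch below enumerates candidate intervals for a text of length m = 500 with decay parameter a = 2 and minimum interval length 50. It assumes the standard seeded binary segmentation construction (layers of evenly shifted intervals whose length shrinks by a factor of a per layer); the function name `seeded_intervals` and its parameters are hypothetical and are not taken from the authors' repository, so treat this as an illustration rather than the paper's implementation.

```python
import math


def seeded_intervals(m, decay=2.0, min_length=50):
    """Enumerate seeded intervals over token positions [0, m).

    Assumption: standard seeded binary segmentation layout, where layer k
    holds 2 * ceil(decay**(k-1)) - 1 evenly shifted intervals of length
    m / decay**(k-1). Layers whose interval length falls below
    `min_length` are skipped. Intervals are half-open (start, end) pairs.
    """
    intervals = {(0, m)}  # layer 1: the whole text
    num_layers = math.ceil(math.log(m, decay))
    for k in range(2, num_layers + 1):
        length = m / decay ** (k - 1)
        if length < min_length:
            break  # deeper layers only get shorter
        n_k = 2 * math.ceil(decay ** (k - 1)) - 1  # intervals in this layer
        shift = (m - length) / (n_k - 1)  # spacing between interval starts
        for i in range(n_k):
            start = math.floor(i * shift)
            end = min(math.ceil(i * shift + length), m)
            intervals.add((start, end))
    return sorted(intervals)


if __name__ == "__main__":
    # Settings quoted in the Experiment Setup row: m = 500, a = 2, min length 50.
    for start, end in seeded_intervals(500, decay=2.0, min_length=50):
        print(start, end)
```

Under this assumed construction, each candidate interval would then be passed to the block bootstrap-based test, and the minimum length of 50 keeps every interval long enough to hold at least two blocks of size 20.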