Towards Codable Watermarking for Injecting Multi-Bits Information to LLMs

Authors: Lean Wang, Wenkai Yang, Deli Chen, Hao Zhou, Yankai Lin, Fandong Meng, Jie Zhou, Xu Sun

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experimental results show that our method outperforms the baseline under comprehensive evaluation. |
| Researcher Affiliation | Collaboration | National Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University; Gaoling School of Artificial Intelligence, Renmin University of China; Pattern Recognition Center, WeChat AI, Tencent Inc., China; DeepSeek AI |
| Pseudocode | Yes | Algorithm 1: A General Message Encoding Framework for A Settled P_W |
| Open Source Code | Yes | Our code is available at https://github.com/lancopku/codable-watermarking-for-llm. |
| Open Datasets | Yes | Our prompt inputs are derived from the news-like subset of the C4 dataset (Raffel et al., 2019). |
| Dataset Splits | No | The paper describes extracting 500 prompt inputs from the C4 dataset for generation, but it does not specify explicit training, validation, or test splits of C4, so the data partitioning cannot be reproduced exactly. |
| Hardware Specification | No | The paper names the LLMs used for generation and perplexity calculation (OPT-1.3B, LLaMA-7B/13B, GPT-2) but does not specify the underlying hardware (e.g., GPU models, CPU types, or cloud instances) on which they were run. |
| Software Dependencies | No | The paper mentions specific language models and tools (OPT-1.3B, LLaMA-7B/13B, GPT-2, RoBERTa-Large) but does not give version numbers for them or for ancillary software such as Python or PyTorch. |
| Experiment Setup | Yes | In each experiment, we extract 500 prompt inputs and truncate them to a uniform length of 300 tokens. The language model is then requested to generate 200 tokens via a 4-way beam search. To mitigate repetition in the generated text, we implement a repetition penalty of 1.5. ... Additional hyper-parameters used in Balance-Marking are set to the following: A = 100, L_prefix = 10, σ = 0.5 and M = {0, 1, ..., 2^20 - 1}. |
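The Experiment Setup row above fully specifies the decoding configuration, so it can be approximated directly with off-the-shelf tooling. The sketch below is a minimal, hedged reconstruction of that setup only (prompts from the news-like subset of C4, truncation to 300 tokens, 200 generated tokens via 4-way beam search with a repetition penalty of 1.5, OPT-1.3B as the base LM); it is not the authors' watermarking code, which lives in the linked repository. The dataset identifier `allenai/c4` with the `realnewslike` config, the use of streaming, and the choice of the `train` split are assumptions made here for illustration.

```python
# Minimal sketch of the generation setup described in the table above
# (assumed identifiers: "allenai/c4" / "realnewslike", "facebook/opt-1.3b").
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "facebook/opt-1.3b"   # one of the LMs named in the paper
NUM_PROMPTS = 500                  # prompt inputs extracted per experiment
PROMPT_LEN = 300                   # uniform prompt length in tokens
NEW_TOKENS = 200                   # tokens generated per prompt
NUM_BEAMS = 4                      # 4-way beam search
REPETITION_PENALTY = 1.5

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).to(device).eval()

# Stream the news-like subset of C4 and take the first 500 documents as prompts.
c4 = load_dataset("allenai/c4", "realnewslike", split="train", streaming=True)
prompts = [ex["text"] for _, ex in zip(range(NUM_PROMPTS), c4)]

generations = []
for text in prompts:
    # Truncate each prompt to a uniform length of 300 tokens.
    inputs = tokenizer(
        text, truncation=True, max_length=PROMPT_LEN, return_tensors="pt"
    ).to(device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=NEW_TOKENS,
            num_beams=NUM_BEAMS,
            repetition_penalty=REPETITION_PENALTY,
            do_sample=False,
        )
    # Keep only the newly generated continuation, not the echoed prompt.
    continuation = output[0, inputs["input_ids"].shape[1]:]
    generations.append(tokenizer.decode(continuation, skip_special_tokens=True))
```

The Balance-Marking hyper-parameters quoted in the row (A = 100, L_prefix = 10, σ = 0.5, M = {0, 1, ..., 2^20 - 1}) apply to the message-encoding procedure itself and are therefore not reflected in this plain decoding sketch; they are set inside the authors' released code.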