Optimizing Watermarks for Large Language Models
Authors: Bram Wouters
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper introduces a systematic approach to this trade-off in terms of a multi-objective optimization problem. For a large class of robust, efficient watermarks, the associated Pareto optimal solutions are identified and shown to outperform existing robust, efficient watermarks. ... Our contribution. For a large class of robust, efficient watermarks based on the green-red split of the vocabulary, we translate the test-text trade-off into a multi-objective optimization problem and identify the associated Pareto optimal solutions. We empirically validate the optimality of the solutions and show that they outperform existing proposals of robust, efficient watermarks (Kirchenbauer et al., 2023; Kuditipudi et al., 2024; Wu et al., 2023) with respect to the test-text trade-off. ... 4. Experiments (A hedged sketch of the basic green-red split watermark follows the table.) |
| Researcher Affiliation | Academia | 1University of Amsterdam. Correspondence to: Bram Wouters <b.m.wouters@uva.nl>. |
| Pseudocode | No | The paper contains mathematical equations and descriptions of functions but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/brwo/optimizing-watermarks. |
| Open Datasets | Yes | From the C4 dataset (Raffel et al., 2020) a sample of 500 (news) articles is drawn randomly. (A C4 sampling sketch follows the table.) |
| Dataset Splits | No | The paper describes how texts are generated for evaluation purposes using pre-trained LLMs, but it does not specify training/validation/test splits for a model trained within the scope of this paper. |
| Hardware Specification | No | The paper mentions the use of specific LLMs (e.g., OPT-1.3B, BART-large) but does not specify the hardware (e.g., GPU, CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions the 'Huggingface library (Wolf et al., 2020)' but does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | Sampling from the LLM takes place with a temperature of 1.0. In order to generate sequences of a fixed length T = 30, the EOS token is suppressed. ... For TS we use the BART-large model (Liu et al., 2020)... For MT we use the WMT 2016 dataset and use the Multilingual BART model (Liu et al., 2020)... We used the default sampling strategy: beam search with 4 beams. (A generation-settings sketch follows the table.) |
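
The watermarks studied in the paper are built on a green-red split of the vocabulary. For orientation, the following is a minimal sketch of the baseline green-red scheme of Kirchenbauer et al. (2023), not the Pareto-optimal watermarks derived in the paper; the `gamma` (green fraction) and `delta` (logit bias) values are illustrative assumptions.

```python
# Minimal sketch of a green-red split watermark (Kirchenbauer et al., 2023 style).
# Baseline reference scheme only, not the paper's optimized watermark;
# gamma and delta are illustrative choices.
import torch

def greenlist_ids(prev_token_id: int, vocab_size: int, gamma: float = 0.5) -> torch.Tensor:
    """Seed an RNG with the previous token and mark a gamma-fraction of the vocabulary as green."""
    gen = torch.Generator().manual_seed(prev_token_id)
    perm = torch.randperm(vocab_size, generator=gen)
    return perm[: int(gamma * vocab_size)]

def watermark_logits(logits: torch.Tensor, prev_token_id: int,
                     gamma: float = 0.5, delta: float = 2.0) -> torch.Tensor:
    """Add a bias delta to the next-token logits of green tokens before sampling."""
    green = greenlist_ids(prev_token_id, logits.shape[-1], gamma)
    biased = logits.clone()
    biased[green] += delta
    return biased

def detection_z_score(token_ids: list[int], vocab_size: int, gamma: float = 0.5) -> float:
    """z-statistic for the number of green tokens among T generated tokens."""
    hits = sum(
        tok in set(greenlist_ids(prev, vocab_size, gamma).tolist())
        for prev, tok in zip(token_ids[:-1], token_ids[1:])
    )
    T = len(token_ids) - 1
    return (hits - gamma * T) / (gamma * (1 - gamma) * T) ** 0.5
```

A text is flagged as watermarked when the z-score exceeds a chosen threshold; the paper's contribution concerns how this family of schemes is tuned to trade off detectability against text quality.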
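
The prompt data described in the Open Datasets row could be drawn roughly as follows. This is a sketch under assumptions: the table does not specify the C4 configuration, split, or sampling seed, so the `realnewslike` subset and the shuffle parameters below are guesses.

```python
# Hedged sketch: sample 500 news-like articles from C4 with the Hugging Face
# `datasets` library. The "realnewslike" configuration, seed, and buffer size
# are assumptions, not details taken from the paper.
from datasets import load_dataset

c4 = load_dataset("allenai/c4", "realnewslike", split="train", streaming=True)
shuffled = c4.shuffle(seed=0, buffer_size=10_000)
articles = [row["text"] for _, row in zip(range(500), shuffled)]
```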
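
The generation settings quoted in the Experiment Setup row map onto the Hugging Face `generate` API roughly as sketched below. The prompt and the use of `suppress_tokens` to enforce the fixed length are assumptions about how the setup could be reproduced, not the paper's actual code.

```python
# Hedged sketch of the quoted generation settings: sampling from OPT-1.3B at
# temperature 1.0 for a fixed length of T = 30 tokens with the EOS token suppressed.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")

prompt = "The city council announced on Tuesday that"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")

# Suppressing EOS prevents early termination, so every completion has exactly 30 new tokens.
out = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.0,
    max_new_tokens=30,
    suppress_tokens=[tokenizer.eos_token_id],
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

For the summarization and translation tasks, the quoted default strategy corresponds to calling `generate` on the BART-based seq2seq models with `num_beams=4` and sampling disabled.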