CATER: Intellectual Property Protection on Text Generation APIs via Conditional Watermarks
Authors: Xuanli He, Qiongkai Xu, Yi Zeng, Lingjuan Lyu, Fangzhao Wu, Jiwei Li, Ruoxi Jia
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental (toy watermark sketch below) | Empirically, we observe that high-order conditions lead to an exponential growth of suspicious (unused) watermarks, making our crafted watermarks more stealthy. In addition, CATER can effectively identify IP infringement under architectural mismatch and cross-domain imitation attacks, with negligible impairments on the generation quality of victim APIs. We envision our work as a milestone for stealthily protecting the IP of text generation APIs. ... 4 Experiments Text Generation Tasks. We examine two widespread text generation tasks: machine translation and document summarization, which have been successfully deployed as commercial APIs. To demonstrate the generality of CATER, we also apply it to two more text generation tasks: i) text simplification and ii) paraphrase generation. |
| Researcher Affiliation | Collaboration | Xuanli He (University College London) zodiac.he@gmail.com; Qiongkai Xu (University of Melbourne) qiongkai.xu@unimelb.edu.au; Yi Zeng (Virginia Tech) yizeng@vt.edu; Lingjuan Lyu (Sony AI) Lingjuan.Lv@sony.com; Fangzhao Wu (Microsoft Research Asia) fangzwu@microsoft.com; Jiwei Li (Shannon.AI, Zhejiang University) jiwei_li@shannonai.com; Ruoxi Jia (Virginia Tech) ruoxijia@vt.edu |
| Pseudocode | No | The paper presents mathematical formulations and optimization problems but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and data are available at: https://github.com/xlhex/cater_neurips.git |
| Open Datasets | Yes | Machine Translation: We consider WMT14 German (De)→English (En) translation [2] as the testbed. We follow the official split: train (4.5M) / dev (3,000) / test (3,003). ... Document summarization: CNN/DM [14] utilizes informative headlines as summaries of news articles. We reuse the dataset preprocessed by See et al. [38] with a partition of train/dev/test as 287K / 13K / 11K. |
| Dataset Splits | Yes (loading sketch below) | Machine Translation: We consider WMT14 German (De)→English (En) translation [2] as the testbed. We follow the official split: train (4.5M) / dev (3,000) / test (3,003). ... Document summarization: CNN/DM [14] utilizes informative headlines as summaries of news articles. We reuse the dataset preprocessed by See et al. [38] with a partition of train/dev/test as 287K / 13K / 11K. |
| Hardware Specification | No | The paper does not specify the hardware used for experiments, such as GPU models, CPU types, or cloud computing instance details. It only mentions 'Transformer-base' models. |
| Software Dependencies | No | The paper mentions software such as Gurobi, Moses, BART, and Transformer-base models, but it does not give version numbers for any of them, which are needed to reproduce the software environment. |
| Experiment Setup | Yes (BPE sketch below) | We use 32K and 16K BPE vocabulary [39] for experiments on WMT14 and CNN/DM, respectively. ... We set the size of synonyms to 2 and vary this value in Appendix F.1. The detailed construction of watermarks and approximation of p in Equation 1 for CATER is provided in Appendix D. ... The training details are summarized in Appendix D. |
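To make the conditional-watermark idea in the Research Type row concrete, here is a toy sketch of conditional synonym substitution in Python. It is not the authors' implementation: the synonym table and the triggering condition (parity of the preceding word's length, standing in for CATER's linguistic features) are hypothetical.

```python
# Toy conditional watermark: replace a word with its synonym only when a
# condition on the surrounding context holds. The synonym pairs and the
# condition below are hypothetical illustrations, not CATER's actual rules.
SYNONYMS = {"movie": "film", "big": "large"}

def watermark(tokens):
    out = []
    for tok in tokens:
        prev = out[-1] if out else ""
        # Condition: substitute only when the preceding word's length is odd.
        if tok in SYNONYMS and len(prev) % 2 == 1:
            out.append(SYNONYMS[tok])
        else:
            out.append(tok)
    return out

print(watermark("we saw a big movie yesterday".split()))
# ['we', 'saw', 'a', 'large', 'film', 'yesterday']
```

A verifier holding the same table and condition can then test whether a suspect model over-produces the watermarked word choices exactly where the condition holds, which is the kind of conditional signal the quoted abstract describes.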
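The split sizes quoted in the Dataset Splits row can be sanity-checked with a short script. This sketch assumes the Hugging Face `datasets` library and its public `wmt14` and `cnn_dailymail` configs; whether the `3.0.0` config matches the See et al. preprocessing the paper reuses is an assumption.

```python
from datasets import load_dataset

# WMT14 De-En; the paper reports train/dev/test = 4.5M / 3,000 / 3,003.
wmt = load_dataset("wmt14", "de-en")
print({name: len(split) for name, split in wmt.items()})

# CNN/DailyMail; the paper reports 287K / 13K / 11K. Using the "3.0.0"
# config as a stand-in for the See et al. preprocessing is an assumption.
cnndm = load_dataset("cnn_dailymail", "3.0.0")
print({name: len(split) for name, split in cnndm.items()})
```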
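The 32K/16K BPE vocabularies in the Experiment Setup row can be reproduced in spirit with any BPE trainer. The paper cites Sennrich et al.'s BPE [39]; this sketch instead uses `sentencepiece` in BPE mode as a stand-in, and the input file path and training options are assumptions.

```python
import sentencepiece as spm

# Train a 32K BPE vocabulary (the paper uses 32K for WMT14 and 16K for
# CNN/DM). The input path "wmt14_train.txt" is a placeholder.
spm.SentencePieceTrainer.train(
    input="wmt14_train.txt",
    model_prefix="wmt14_bpe32k",
    vocab_size=32000,
    model_type="bpe",
)

# Load the trained model and tokenize a sample sentence into subwords.
sp = spm.SentencePieceProcessor(model_file="wmt14_bpe32k.model")
print(sp.encode("Watermarked outputs look natural.", out_type=str))
```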