Raidar: geneRative AI Detection viA Rewriting
Authors: Chengzhi Mao, Carl Vondrick, Hao Wang, Junfeng Yang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Visualizations, empirical experiments show that our simple rewriting-based algorithm Raidar significantly improves detection for several established paragraph-level detection benchmarks. |
| Researcher Affiliation | Collaboration | Chengzhi Mao¹, Carl Vondrick¹, Hao Wang², Junfeng Yang¹ (¹Columbia University, ²Rutgers University) |
| Pseudocode | Yes | Algorithm 1 Detecting LLM Generated Content via Output Invariance |
| Open Source Code | Yes | Our data and code is available at https://github.com/cvlab-columbia/RaidarLLMDetect.git. |
| Open Datasets | Yes | Creative Writing Dataset is a language dataset based on the subreddit Writing Prompts, which is creative writing by a community based on the prompts. We use the dataset generated by Verma et al. (2023). |
| Dataset Splits | Yes | The training and testing domain for Table 2. For all experiments in Table 2, we use logistic regression, and use the same source and target for invariance, equivariance, and uncertainty. For News, we train on Creative Writing and test on News. For Creative Writing, we train on News and test on Creative Writing. For Student Essay, we train on News and test on Student Essay. |
| Hardware Specification | No | The paper mentions interacting with LLM APIs (e.g., GPT-3.5-Turbo) but does not specify the hardware used for running their own experiments and models. |
| Software Dependencies | No | The paper mentions using Logistic Regression and XGBoost for classification but does not specify the version numbers of the software libraries or programming languages used. |
| Experiment Setup | Yes | We use GPT-3.5-Turbo as the LLM to rewrite the input text. Once we obtain the editing distance feature from the rewriting, we use Logistic Regression (Berkson, 1944) or XGBoost (Chen & Guestrin, 2016) to perform the binary classification. |
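
The experiment setup above (rewrite the input, measure the edit distance, classify) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the rewritten strings in `TOY_PAIRS` are invented stand-ins for GPT-3.5-Turbo output, the labels assume that machine-written text changes less under rewriting, and the hand-rolled one-feature logistic regression only stands in for the Logistic Regression or XGBoost classifiers the paper mentions. The names `levenshtein`, `edit_ratio`, `fit_logistic`, `predict_proba`, and `TOY_PAIRS` are hypothetical.

```python
import math

def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance via the standard dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def edit_ratio(original: str, rewritten: str) -> float:
    """Edit distance normalized by the longer string, so it lies in [0, 1]."""
    longest = max(len(original), len(rewritten))
    return levenshtein(original, rewritten) / longest if longest else 0.0

def predict_proba(x: float, w: float, b: float) -> float:
    """Probability of the positive class (here: machine-generated)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def fit_logistic(xs, ys, lr=1.0, steps=5000):
    """One-feature logistic regression fitted by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = predict_proba(x, w, b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Invented (original, rewritten, label) triples; label 1 = machine-generated.
# In a real pipeline the second element would come from the rewriting LLM.
TOY_PAIRS = [
    ("The cat sat on the mat.",
     "The cat sat on the mat.", 1),
    ("It is important to note that results may vary.",
     "It is important to note that results can vary.", 1),
    ("honestly idk why my code broke lol",
     "I am not sure why my code stopped working.", 0),
    ("we went 2 the store n got snacks, was fun",
     "We went to the store and bought snacks; it was enjoyable.", 0),
]

if __name__ == "__main__":
    features = [edit_ratio(o, r) for o, r, _ in TOY_PAIRS]
    labels = [y for _, _, y in TOY_PAIRS]
    w, b = fit_logistic(features, labels)
    for x, y in zip(features, labels):
        print(f"edit_ratio={x:.3f} label={y} p(machine)={predict_proba(x, w, b):.3f}")
```

A single normalized edit ratio keeps the sketch short. For the cross-domain protocol in the Dataset Splits row, the same functions would be fitted on features from one domain (for example News) and evaluated on another (for example Creative Writing).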