RADAR: Robust AI-Text Detection via Adversarial Learning

Authors: Xiaomeng Hu, Pin-Yu Chen, Tsung-Yi Ho

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Evaluated with 8 different LLMs (Pythia, Dolly 2.0, Palmyra, Camel, GPT-J, Dolly 1.0, LLaMA, and Vicuna) across 4 datasets; experimental results show that RADAR significantly outperforms existing AI-text detection methods, especially when paraphrasing is in place.
Researcher Affiliation | Collaboration | Xiaomeng Hu, The Chinese University of Hong Kong, Sha Tin, Hong Kong (xmhu23@cse.cuhk.edu.hk); Pin-Yu Chen, IBM Research, New York, USA (pin-yu.chen@ibm.com); Tsung-Yi Ho, The Chinese University of Hong Kong, Sha Tin, Hong Kong (tyho@cse.cuhk.edu.hk)
Pseudocode | Yes | Algorithm 1: RADAR: Robust AI-Text Detection via Adversarial Learning
Open Source Code | No | Project page and demos: https://radar.vizhub.ai. The IBM demo was developed by Hendrik Strobelt and Benjamin Hoover at IBM Research; the Hugging Face demo was developed by Xiaomeng Hu.
Open Datasets | Yes | "For training, we sampled 160K documents from WebText [9] to build the human-text corpus H." ... [9] Aaron Gokaslan, Vanya Cohen, Ellie Pavlick, and Stefanie Tellex. OpenWebText corpus. http://Skylion007.github.io/OpenWebTextCorpus, 2019.
Dataset Splits | Yes | "During training, we use the test set of WebText as the validation dataset to estimate RADAR's performance." ... Table A1: Summary of the used human-text corpora. Phase: Validation | Source Dataset: WebText-test | Dataset Key: text | Sample Counts: 4007
Hardware Specification | Yes | Experiments were run on 2 GPUs (NVIDIA Tesla V100, 32 GB).
Software Dependencies | No | No specific version numbers for key software components (e.g., Python, PyTorch, TensorFlow, or specific library versions) were found.
Experiment Setup | Yes | "During training, we set the batch size to 32 and train the models until the validation loss converges. We use AdamW as the optimizer with the initial learning rate set to 1e-5 and use linear decay for both Gσ and Dϕ. We set λ = 0.5 for sample balancing in Eq. 3 and set γ = 0.01 in Eq. 2."
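The adversarial setup the report describes (Algorithm 1; batch size 32, AdamW at 1e-5 with linear decay for both Gσ and Dϕ, λ = 0.5) can be sketched as a toy PyTorch training loop. The tiny linear stand-ins, the dummy feature inputs, the binary-cross-entropy loss forms, and the exact placement of λ are illustrative assumptions, not the paper's implementation; the sketch only shows the alternating paraphraser/detector update structure under the reported hyperparameters.

```python
import torch
import torch.nn.functional as F
from torch.optim import AdamW
from torch.optim.lr_scheduler import LinearLR

torch.manual_seed(0)

# Toy stand-ins for the paraphraser G_sigma and detector D_phi.
# The real models are transformer LMs; linear layers over random
# feature vectors only illustrate the alternating-update structure.
dim, batch = 16, 32
G = torch.nn.Linear(dim, dim)   # paraphraser: rewrites AI-text features
D = torch.nn.Linear(dim, 1)     # detector: logit of "human-written"

# Hyperparameters reported in the Experiment Setup row.
opt_G = AdamW(G.parameters(), lr=1e-5)
opt_D = AdamW(D.parameters(), lr=1e-5)
sched_G = LinearLR(opt_G, start_factor=1.0, end_factor=0.0, total_iters=100)
sched_D = LinearLR(opt_D, start_factor=1.0, end_factor=0.0, total_iters=100)
lam = 0.5  # sample-balancing weight; its exact role in Eq. 3 is assumed here

ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)
for step in range(5):
    human = torch.randn(batch, dim)   # stand-in for human corpus H
    ai = torch.randn(batch, dim)      # stand-in for raw AI text
    para = G(ai)                      # "paraphrased" AI text

    # Detector update: score human text high, paraphrased AI text low.
    d_loss = lam * F.binary_cross_entropy_with_logits(D(human), ones) \
        + (1 - lam) * F.binary_cross_entropy_with_logits(D(para.detach()), zeros)
    opt_D.zero_grad(); d_loss.backward(); opt_D.step(); sched_D.step()

    # Paraphraser update: rewrite AI text so the detector scores it as human.
    g_loss = F.binary_cross_entropy_with_logits(D(G(ai)), ones)
    opt_G.zero_grad(); g_loss.backward(); opt_G.step(); sched_G.step()
```

In the paper the paraphraser is a fine-tuned language model updated from the detector's feedback rather than a linear map, and γ = 0.01 enters the objective of Eq. 2, which this sketch does not reproduce.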