DALD: Improving Logits-based Detector without Logits from Black-box LLMs

Authors: Cong Zeng, Shengkun Tang, Xianjun Yang, Yuanzhou Chen, Yiyou Sun, Zhiqiang Xu, Yao Li, Haifeng Chen, Wei Cheng, Dongkuan (DK) Xu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments validate that our methodology reliably secures high detection precision for LLM-generated text and effectively detects text from diverse model origins through a singular detector. Our approach achieves SOTA performance in black-box settings on different advanced closed-source and open-source models.
Researcher Affiliation | Collaboration | MBZUAI, University of California, Santa Barbara, University of California, Los Angeles, NEC Labs America, University of North Carolina, Chapel Hill, NC State University
Pseudocode | No | The paper describes the methodology and process in prose and uses figures (e.g., Figure 3 for the framework overview) and equations, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code and data are released at https://github.com/cong-zeng/DALD
Open Datasets | Yes | We follow Fast-DetectGPT in using four datasets for the black-box detection evaluation: XSum [52], WritingPrompts [53], WMT-2016 [54], and PubMedQA [55]. Our training data are collected from the open-source WildChat [59] dataset for GPT-3.5 and GPT-4.
Dataset Splits | No | The paper specifies training and test sets but does not explicitly mention a separate validation set or details about its use for hyperparameter tuning or early stopping. It states, 'We do not tune the hyperparameters carefully.'
Hardware Specification | Yes | For training time, our method fine-tunes Llama-2-7B with 5K samples on 4 A6000 GPUs.
Software Dependencies | No | The paper mentions using PyTorch and Low-Rank Adaptation (LoRA) but does not provide specific version numbers for these software components or any other key libraries.
Experiment Setup | Yes | For LoRA hyperparameters, we use a LoRA rank of 16 and set lora_alpha to 32. Dropout is set to 0.05. For training hyperparameters, we set the max length to 512 for texts from the GPT-4 and GPT-3.5 models and to 2048 for texts from Claude-3. We fine-tune the surrogate model with a learning rate of 1e-4. The batch size is set to 1 per device with gradient accumulation over 4 steps.
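
To make the Experiment Setup row concrete, the sketch below assembles the reported hyperparameters into a LoRA fine-tuning configuration. It is a minimal sketch, assuming the Hugging Face transformers and peft libraries; the base model identifier, the LoRA target modules, and the output directory are assumptions not stated in the paper, and the optimizer, number of epochs, and data pipeline are omitted.

# Minimal sketch of the reported fine-tuning setup (assumed stack: transformers + peft).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments

BASE_MODEL = "meta-llama/Llama-2-7b-hf"  # surrogate model reported in the paper (identifier assumed)
MAX_LENGTH = 512                         # 512 for GPT-3.5/GPT-4 texts; 2048 for Claude-3 texts

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.model_max_length = MAX_LENGTH
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# LoRA hyperparameters as reported: rank 16, lora_alpha 32, dropout 0.05.
# The target modules are an assumption (a common choice for Llama-2).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Training hyperparameters as reported: learning rate 1e-4, per-device batch size 1,
# gradient accumulation over 4 steps (the paper trains on 4 A6000 GPUs).
training_args = TrainingArguments(
    output_dir="dald-surrogate-lora",  # assumed name
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
)

With this configuration object in hand, the model and training_args would be passed to a standard causal-LM trainer along with the WildChat-derived training samples; that data-preparation step is not specified in enough detail in the paper to sketch here.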