Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

A Survey on the Possibilities & Impossibilities of AI-generated Text Detection

Authors: Soumya Suvra Ghosal, Souradip Chakraborty, Jonas Geiping, Furong Huang, Dinesh Manocha, Amrit Bedi

TMLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this survey, we aim to provide a concise categorization and overview of current work encompassing both the prospects and the limitations of AI-generated text detection. ... Specifically on XSum dataset, when samples are paraphrased using a Open AI GPT-3.5-Turbo API (which is different from the paraphraser using during training RADAR), RADAR improves detection performance by 16.6% and 59.5% as compared to a Roberta-based detector fine-tuned on Web Text (Gokaslan et al., 2019) and Detect GPT (Mitchell et al., 2023). Table 1: Evaluating popular language models using state-of-art Post-hoc detectors on Xsum, SQuAD, and WP dataset. The table is motivated from Hu et al. (2023). The values are obtained by reproducing the results in (Hu et al., 2023).
Researcher Affiliation	Academia	Soumya Suvra Ghosal* EMAIL University of Maryland, College Park, MD, USA Souradip Chakraborty* EMAIL University of Maryland, College Park, MD, USA Jonas Geiping EMAIL University of Maryland, College Park, MD, USA Furong Huang EMAIL University of Maryland, College Park, MD, USA Dinesh Manocha EMAIL University of Maryland, College Park, MD, USA Amrit Singh Bedi EMAIL University of Maryland, College Park, MD, USA
Pseudocode	No	The paper describes various methods, such as the watermarking operation (Section 4.1.2), in numbered steps; however, these are presented as descriptive text rather than formal pseudocode blocks or algorithms.
Open Source Code	No	The paper is a survey and does not introduce a new methodology for which dedicated source code would typically be released. It does not contain any explicit statements about code availability or links to code repositories for the work described in this paper.
Open Datasets	No	This paper is a survey and does not conduct its own experiments requiring a dataset. While it references various datasets used in the reviewed literature (e.g., Xsum, SQuAD, Web Text), it does not provide access information for a dataset used in its own analysis or methodology.
Dataset Splits	No	This paper is a survey and does not present its own experimental results. Therefore, it does not provide specific dataset split information for data partitioning.
Hardware Specification	No	This paper is a survey and does not describe its own experimental setup or computations that would require specific hardware specifications.
Software Dependencies	No	This paper is a survey and does not describe its own experimental implementation or methodology that would require specific ancillary software details with version numbers.
Experiment Setup	No	This paper is a survey and does not describe its own experimental setup, hyperparameters, or system-level training settings.