Intrinsic Dimension Estimation for Robust Detection of AI-Generated Texts

Authors: Eduard Tulchinskii, Kristian Kuznetsov, Laida Kushnareva, Daniil Cherniavskii, Sergey Nikolenko, Evgeny Burnaev, Serguei Barannikov, Irina Piontkovskaya

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 Experiments
Researcher Affiliation | Collaboration | Skolkovo Institute of Science and Technology, Russia; AI Foundation and Algorithm Lab, Russia; Artificial Intelligence Research Institute (AIRI), Russia; CNRS, Université Paris Cité, France; St. Petersburg Department of the Steklov Institute of Mathematics, Russia
Pseudocode | Yes | Appendix B: Algorithm for computing the PHD (a sketch of this estimator follows the table)
Open Source Code | Yes | We release code and data: github.com/ArGintum/GPTID
Open Datasets | Yes | Our main dataset of human texts is Wiki40b [Guo et al., 2020].
Dataset Splits | Yes | We split data into train / validation / test sets in proportion 80%/10%/10%.
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) used for running the experiments are provided in the paper.
Software Dependencies | No | The paper mentions the 'scikit-dimension library [Bac et al., 2021]' but does not provide a specific version number. Other software mentioned consists of models (RoBERTa-base, XLM-R) or general tools without versioning. (A minimal usage sketch follows the table.)
Experiment Setup | Yes | We consider consistent text samples of medium size, with length 300 tokens; ... In our experiments, we use RoBERTa-base [Liu et al., 2019] for English and XLM-R [Goyal et al., 2021] for other languages. ... Finally, we construct a simple single-feature classifier for artificial text detection with PHD as the feature, training a logistic regression on some dataset of real and generated texts. ... We split data into train / validation / test sets in proportion 80%/10%/10%. ... Appendix B (Algorithm for computing the PHD) states: 'k = 8 is a good trade-off between speed of computation and variance of PHD estimation for our data (our sets of points vary between 50 and 510 in size). As for n̂, we always used n̂ = 40. ... For all our experiments we took J = 7.' (A sketch of this pipeline follows the table.)
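
Sketch 1 (PHD estimator). The Pseudocode row refers to Appendix B of the paper, which gives an MST-based algorithm for the persistent homology dimension (PHD). Below is a minimal, illustrative Python sketch of that general scheme: regress the log of the total 0-dimensional persistence against the log of the subsample size and convert the slope into a dimension. The function names and the default sampling constants (min_points, n_sizes, n_reps) are ours and only loosely mirror the paper's n̂, k and J; this is not the authors' implementation (see github.com/ArGintum/GPTID for that).

```python
# Minimal sketch of MST-based persistent homology dimension (PHD) estimation.
# Sampling schedule and constants are illustrative, not the paper's exact values.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform
from sklearn.linear_model import LinearRegression


def total_persistence(points: np.ndarray, alpha: float = 1.0) -> float:
    """Sum of MST edge lengths raised to alpha (0-dim total persistence E_0^alpha)."""
    dists = squareform(pdist(points))
    mst = minimum_spanning_tree(dists).toarray()
    return float(np.sum(mst[mst > 0] ** alpha))


def phd_estimate(points: np.ndarray, alpha: float = 1.0,
                 min_points: int = 40, n_sizes: int = 8, n_reps: int = 7,
                 seed: int = 0) -> float:
    """Estimate the PHD of a point cloud.

    E_0^alpha(n) scales roughly as n^{(d - alpha) / d} for intrinsic dimension d,
    so a log-log regression of total persistence on subsample size n gives a
    slope m from which d = alpha / (1 - m).
    """
    rng = np.random.default_rng(seed)
    n = len(points)
    sizes = np.unique(np.linspace(min_points, n, n_sizes, dtype=int))
    xs, ys = [], []
    for size in sizes:
        for _ in range(n_reps):
            idx = rng.choice(n, size=size, replace=False)
            xs.append(np.log(size))
            ys.append(np.log(total_persistence(points[idx], alpha)))
    slope = LinearRegression().fit(np.array(xs)[:, None], ys).coef_[0]
    return alpha / (1.0 - slope)
```

For alpha = 1 the total persistence is just the length of the minimal spanning tree, and the regression-plus-conversion step follows the scaling law E_0^alpha(n) ~ n^{(d - alpha)/d} used in PHD-style estimators.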
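
Sketch 2 (scikit-dimension). The Software Dependencies row notes that baseline intrinsic dimension estimators come from the scikit-dimension library [Bac et al., 2021], with no version pinned. A minimal usage sketch; the MLE estimator and the random stand-in data are chosen only for illustration, not taken from the paper.

```python
# Minimal scikit-dimension usage sketch (no version is pinned in the paper;
# the MLE estimator below is illustrative rather than the paper's exact baseline).
import numpy as np
import skdim

X = np.random.default_rng(0).normal(size=(500, 768))  # stand-in for token embeddings
estimator = skdim.id.MLE().fit(X)
print("MLE intrinsic dimension estimate:", estimator.dimension_)
```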
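
Sketch 3 (single-feature detector). The Experiment Setup row describes a logistic regression trained on PHD as the only feature, computed over RoBERTa-base token embeddings of roughly 300-token passages, with an 80%/10%/10% split. The sketch below reuses phd_estimate from Sketch 1 and is an illustrative reconstruction, not the authors' code: the helper load_detection_dataset is hypothetical, and using the last hidden layer as the embedding point cloud is an assumption.

```python
# Illustrative single-feature detection pipeline: PHD of RoBERTa-base token
# embeddings as the only input to a logistic regression.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModel.from_pretrained("roberta-base")


def text_phd(text: str) -> float:
    """PHD of the last-layer token embeddings of a ~300-token passage."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=300)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # shape: (seq_len, 768)
    return phd_estimate(hidden.numpy())


# Hypothetical loader: should return ~300-token passages and binary labels
# (0 = human-written, 1 = model-generated).
texts, labels = load_detection_dataset()

features = np.array([[text_phd(t)] for t in texts])
X_train, X_tmp, y_train, y_tmp = train_test_split(
    features, labels, test_size=0.2, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=0)  # overall 80%/10%/10%

clf = LogisticRegression().fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```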