Intrinsic Dimension Estimation for Robust Detection of AI-Generated Texts
Authors: Eduard Tulchinskii, Kristian Kuznetsov, Laida Kushnareva, Daniil Cherniavskii, Sergey Nikolenko, Evgeny Burnaev, Serguei Barannikov, Irina Piontkovskaya
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experiments |
| Researcher Affiliation | Collaboration | 1 Skolkovo Institute of Science and Technology, Russia; 2 AI Foundation and Algorithm Lab, Russia; 3 Artificial Intelligence Research Institute (AIRI), Russia; 4 CNRS, Université Paris Cité, France; 5 St. Petersburg Department of the Steklov Institute of Mathematics, Russia |
| Pseudocode | Yes | B Algorithm for computing the PHD |
| Open Source Code | Yes | We release code and data: github.com/ArGintum/GPTID |
| Open Datasets | Yes | Our main dataset of human texts is Wiki40b [Guo et al., 2020]. |
| Dataset Splits | Yes | We split data into train / validation / test sets in proportion 80%/10%/10%. |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) used for running the experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions using the 'scikit-dimension library [Bac et al., 2021]' but does not give a version number. The other software mentioned consists of models (RoBERTa-base, XLM-R) and general tools, none with version numbers. |
| Experiment Setup | Yes | We consider consistent text samples of medium size, with length 300 tokens; ... In our experiments, we use RoBERTa-base [Liu et al., 2019] for English and XLM-R [Goyal et al., 2021] for other languages. ... Finally, we construct a simple single-feature classifier for artificial text detection with PHD as the feature, training a logistic regression on some dataset of real and generated texts. ... We split data into train / validation / test sets in proportion 80%/10%/10%. ... Appendix B (Algorithm for computing the PHD) states: 'k = 8 is a good trade-off between speed of computation and variance of PHD estimation for our data (our sets of points vary between 50 and 510 in size). As for n̂, we always used n̂ = 40. ... For all our experiments we took J = 7.' |
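
The Appendix B parameters quoted above can be illustrated with a minimal sketch of a persistent-homology-dimension (PHD) style estimate. This is not the authors' released code. It assumes α = 1, in which case the total lifetime of 0-dimensional persistent homology equals the length of the Euclidean minimum spanning tree (MST), and it estimates the dimension as 1 / (1 − m), where m is the slope of log MST length against log subsample size. Reading k as the number of subsample sizes, n̂ as the smallest subsample size, and J as the number of random subsamples averaged per size is an assumption, as are the evenly spaced size schedule and the helper names `mst_length` and `phd_estimate`.

```python
import math
import random


def mst_length(points):
    """Total edge length of the Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        return 0.0
    best = [math.inf] * n      # cheapest known edge from the tree to each point
    in_tree = [False] * n
    best[0] = 0.0              # start the tree at point 0
    total = 0.0
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        # Relax the cheapest connecting edge for every remaining point.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def phd_estimate(points, k=8, n_hat=40, J=7, seed=0):
    """Sketch of a PHD-style dimension estimate for alpha = 1 (assumed)."""
    n = len(points)
    if n <= n_hat:
        raise ValueError("need more than n_hat points")
    rng = random.Random(seed)
    # ASSUMPTION: k subsample sizes evenly spaced from n_hat up to n.
    sizes = [n_hat + round(i * (n - n_hat) / (k - 1)) for i in range(k)]
    log_n, log_e = [], []
    for size in sizes:
        # ASSUMPTION: average the MST length over J random subsamples per size.
        lengths = [mst_length(rng.sample(points, size)) for _ in range(J)]
        log_n.append(math.log(size))
        log_e.append(math.log(sum(lengths) / J))
    # Ordinary least-squares slope of log(length) against log(size).
    mean_x = sum(log_n) / k
    mean_y = sum(log_e) / k
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_e))
             / sum((x - mean_x) ** 2 for x in log_n))
    # MST length grows like n^((d - 1) / d), so d = 1 / (1 - slope).
    return 1.0 / (1.0 - slope)
```

In the paper's setting the point cloud would be the contextual token embeddings of one text (for example from RoBERTa-base), passed as a list of coordinate tuples to `phd_estimate`; the returned value would then serve as the single feature for the logistic regression classifier. The pure-Python MST is quadratic per subsample, which is adequate for point sets of the size quoted above (50 to 510 points).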