ALISON: Fast and Effective Stylometric Authorship Obfuscation

Authors: Eric Xing, Saranya Venkatraman, Thai Le, Dongwon Lee

AAAI 2024

Reproducibility Variable Result LLM Response
Research Type: Experimental. "ALISON: Fast and Effective Stylometric Authorship Obfuscation. Eric Xing (1), Saranya Venkatraman (2), Thai Le (3), Dongwon Lee (2). (1) McKelvey School of Engineering, Washington University in St. Louis, MO, USA; (2) College of Information Sciences and Technology, The Pennsylvania State University, PA, USA; (3) School of Engineering, University of Mississippi, MS, USA. e.xing@wustl.edu, saranyav@psu.edu, thaile@olemiss.edu, dongwon@psu.edu. Authorship Attribution (AA) and Authorship Obfuscation (AO) are two competing tasks of increasing importance in privacy research. Modern AA leverages an author's consistent writing style to match a text to its author using an AA classifier. AO is the corresponding adversarial task, aiming to modify a text in such a way that its semantics are preserved, yet an AA model cannot correctly infer its authorship. To address privacy concerns raised by state-of-the-art (SOTA) AA methods, new AO methods have been proposed, but they remain largely impractical to use due to their prohibitively slow training and obfuscation speed, often taking hours. To address this challenge, we propose a practical AO method, ALISON, that (1) dramatically reduces training/obfuscation time, demonstrating more than 10x faster obfuscation than SOTA AO methods; (2) achieves better obfuscation success when attacking three transformer-based AA methods on two benchmark datasets, typically performing 15% better than competing methods; (3) does not require direct signals from a target AA classifier during obfuscation; and (4) utilizes unique stylometric features, allowing sound model interpretation for explainable obfuscation. We also demonstrate that ALISON can effectively prevent four SOTA AA methods from accurately determining the authorship of ChatGPT-generated texts, all while minimally changing the original text semantics. To ensure the reproducibility of our findings, our code and data are available at: https://github.com/EricX003/ALISON." ...
"Experimental Setup. Datasets. We use TuringBench (Uchendu et al. 2021) to evaluate ALISON on machine-generated texts." ... "Results. Obfuscation Success. The experimental results on both datasets from our main obfuscation experiment are summarized in Table 1."
Researcher Affiliation: Academia. "Eric Xing (1), Saranya Venkatraman (2), Thai Le (3), Dongwon Lee (2). (1) McKelvey School of Engineering, Washington University in St. Louis, MO, USA; (2) College of Information Sciences and Technology, The Pennsylvania State University, PA, USA; (3) School of Engineering, University of Mississippi, MS, USA. e.xing@wustl.edu, saranyav@psu.edu, thaile@olemiss.edu, dongwon@psu.edu"
Pseudocode: No. The paper includes a figure (Figure 2) illustrating the obfuscation pipeline, but it depicts a flow rather than a clearly labeled algorithm block with structured steps. The text describes the process, but not in pseudocode form.
Open Source Code: Yes. "To ensure the reproducibility of our findings, our code and data are available at: https://github.com/EricX003/ALISON."
Open Datasets: Yes. "Datasets. We use TuringBench (Uchendu et al. 2021) to evaluate ALISON on machine-generated texts. TuringBench is a collection of 160K human- and machine-generated texts across 20 authors, 19 of which are neural text generation models, and one of whom is human. We also use the Blog Authorship Corpus (Schler et al. 2006) to evaluate ALISON on human-written texts. The dataset consists of the aggregated blog posts of 19,320 bloggers gathered from blogger.com, of which we select only the blogs from the top-10 most frequent authors." Both datasets are publicly available, and all AO results are reported on the test set.
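The "top-10 most frequent authors" filtering quoted above can be sketched with the standard library. This is an illustrative sketch only: the record format and author names are hypothetical, not taken from the Blog Authorship Corpus.

```python
from collections import Counter

# Hypothetical (author, post) records standing in for Blog Authorship Corpus rows.
posts = [
    ("alice", "post 1"), ("bob", "post 2"), ("alice", "post 3"),
    ("carol", "post 4"), ("bob", "post 5"), ("dave", "post 6"),
]

# Keep only posts by the k most frequent authors (the paper uses k = 10;
# k = 2 here to match the toy data).
k = 2
author_counts = Counter(author for author, _ in posts)
top_authors = {author for author, _ in author_counts.most_common(k)}
filtered = [(a, t) for a, t in posts if a in top_authors]
```

`Counter.most_common` breaks ties by first-encountered order, which is fine here since only relative frequency matters for the selection.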
Dataset Splits: Yes. "To evaluate our approach in this setting, we split each publicly available text classification corpus into three disjoint sets, X, X′, and T, stratified by unique authorship labels." ... "Our neural-network-based n-gram classifier is trained on the disjoint 2nd half of the training and validation data that was not used to train our SOTA target models, using V = {1, 2, 3, 4}."
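A stratified three-way split like the one quoted above can be sketched with the standard library. The split fractions and the helper name below are assumptions for illustration; the paper's exact proportions are not given in this review.

```python
import random
from collections import defaultdict

def stratified_three_way_split(items, labels, fracs=(0.5, 0.25, 0.25), seed=0):
    """Split (item, label) pairs into three disjoint sets, stratified by label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in zip(items, labels):
        by_label[label].append(item)
    parts = ([], [], [])
    for label, group in by_label.items():
        rng.shuffle(group)  # randomize within each label stratum
        n = len(group)
        c1 = int(n * fracs[0])
        c2 = c1 + int(n * fracs[1])
        parts[0].extend((x, label) for x in group[:c1])
        parts[1].extend((x, label) for x in group[c1:c2])
        parts[2].extend((x, label) for x in group[c2:])
    return parts

# Toy corpus: 100 items across 10 hypothetical authorship labels.
items = list(range(100))
labels = [i % 10 for i in items]
X, Xp, T = stratified_three_way_split(items, labels)
```

Because the split is done per label group, each authorship label appears in every part in roughly the chosen proportions, and the three parts are disjoint by construction.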
Hardware Specification: No. The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments; it discusses only software and training times.
Software Dependencies: No. The paper mentions software such as BERT, DistilBERT, RoBERTa, ChatGPT, and LLaMA2-7B, but it does not provide specific version numbers for these or other key software components and libraries.
Experiment Setup: No. The paper mentions some aspects of the experimental setup, such as the use of V = {1, 2, 3, 4} for n-gram lengths and the fact that the internal classifier is a "fully connected neural network". However, it lacks specifics such as learning rates, batch sizes, number of epochs, and optimizer settings, which are needed to fully reproduce the experiments.
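For illustration, counting n-grams over the lengths V = {1, 2, 3, 4} mentioned above can be sketched as below. Treating these as character n-grams is an assumption for the sketch; the review does not pin down ALISON's exact feature definition.

```python
from collections import Counter

def char_ngram_counts(text, V=(1, 2, 3, 4)):
    """Count character n-grams of every length in V over a single text."""
    counts = Counter()
    for n in V:
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts

feats = char_ngram_counts("abab")
```

In practice such counts would be vectorized over a shared vocabulary before being fed to a classifier; that step, along with the classifier's hyperparameters, is exactly what the paper leaves unspecified.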