On Robust Prefix-Tuning for Text Classification

Authors: Zonghan Yang, Yang Liu

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on three text classification benchmarks show that our framework substantially improves robustness over several strong baselines against five textual attacks of different types while maintaining comparable accuracy on clean texts.
Researcher Affiliation | Academia | Zonghan Yang, Yang Liu; Department of Computer Science and Technology, Institute for AI Industry Research, Institute for Artificial Intelligence, Tsinghua University, Beijing 100084, China
Pseudocode | No | The paper does not contain explicit pseudocode or algorithm blocks.
Open Source Code | Yes | We release the code at https://github.com/minicheshire/Robust-Prefix-Tuning
Open Datasets | Yes | We consider three text classification benchmarks in our experiments: binary Stanford Sentiment Treebank (SST-2) (Socher et al., 2013), AG's News (Zhang et al., 2015), and Stanford Natural Language Inference (SNLI) (Bowman et al., 2015). Table 10: Dataset statistics for each benchmark. (A dataset-loading sketch follows the table.)
Dataset Splits | Yes | Table 10: Dataset statistics for each benchmark. We have also included the number of classes in each benchmark and the accuracy of a random classifier in theory for better understanding.
Hardware Specification | Yes | We use NVIDIA 3090 GPUs for all of our experiments.
Software Dependencies | No | The paper mentions software such as the Hugging Face Transformers library, the OpenAttack toolkit, PyTorch, and NumPy with citations, but does not specify the version numbers used in the experiments.
Experiment Setup | Yes | We train 100 epochs for SST-2 and 25 epochs for AG's News and SNLI. We use the AdamW optimizer (Loshchilov & Hutter, 2019) provided by the Hugging Face Transformers library (Wolf et al., 2020) to optimize the prefix with the initial learning rate set to 5e-5 in all experiments. Other settings of prefix-tuning follow Li & Liang (2021). We set N = 3 and record the bottom N-layer activations of the LM at the output position for the additional tuning. (A training-setup sketch follows the table.)
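
The three benchmarks quoted in the Open Datasets row are all publicly available. The sketch below shows one way to fetch them with the Hugging Face datasets library; this loader choice is an assumption for illustration, as the paper does not state how the data are obtained.

```python
# Minimal sketch: loading SST-2, AG's News, and SNLI with the Hugging Face
# `datasets` library. The loader choice is an assumption for illustration;
# the paper does not specify how the data are obtained.
from datasets import load_dataset

sst2 = load_dataset("glue", "sst2")   # binary Stanford Sentiment Treebank
ag_news = load_dataset("ag_news")     # AG's News topic classification
snli = load_dataset("snli")           # Stanford Natural Language Inference

# SNLI contains examples without a gold label (label == -1); these are
# commonly filtered out before training.
snli = snli.filter(lambda ex: ex["label"] != -1)

for name, ds in [("SST-2", sst2), ("AG's News", ag_news), ("SNLI", snli)]:
    print(name, {split: len(ds[split]) for split in ds})
```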
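
The settings in the Experiment Setup row can be read as a minimal training configuration. The sketch below is not the authors' implementation: the backbone model, the flat prefix parameterization, and names such as prefix_embedding and num_epochs are illustrative assumptions; only the frozen-LM-plus-trainable-prefix idea, the AdamW optimizer, and the 5e-5 initial learning rate come from the paper.

```python
# Sketch of the reported optimizer setting: a frozen GPT-2 backbone with a
# trainable prefix, optimized by AdamW at an initial learning rate of 5e-5.
# The paper uses the AdamW provided by Hugging Face Transformers; here
# torch.optim.AdamW is used as the current equivalent. The flat prefix
# parameterization below is a simplification of Li & Liang (2021).
import torch
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")
for param in model.parameters():      # freeze the backbone language model
    param.requires_grad = False

prefix_len = 10                       # hypothetical prefix length
n_layer, n_embd = model.config.n_layer, model.config.n_embd
# One key/value vector pair per layer and prefix position (flattened).
prefix_embedding = torch.nn.Parameter(
    0.02 * torch.randn(prefix_len, 2 * n_layer * n_embd)
)

optimizer = torch.optim.AdamW([prefix_embedding], lr=5e-5)

# Per the paper: 100 epochs for SST-2, 25 epochs for AG's News and SNLI.
num_epochs = 100
```

The paper's additional tuning step, which records the activations of the bottom N = 3 LM layers at the output position, is omitted from this sketch for brevity.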