Interpretable Adversarial Perturbation in Input Embedding Space for Text
Authors: Motoki Sato, Jun Suzuki, Hiroyuki Shindo, Yuji Matsumoto
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted our experiments on a sentiment classification (SEC) task, a category classification (CAC) task, and a grammatical error detection (GED) task to evaluate the effectiveness of our methods, iAdvT-Text and iVAT-Text. |
| Researcher Affiliation | Collaboration | 1Preferred Networks, Inc., 2NTT Communication Science Laboratories, 3Nara Institute of Science and Technology, 4RIKEN Center for Advanced Intelligence Project |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code for reproducing our experiments is available at https://github.com/aonotas/interpretable-adv |
| Open Datasets | Yes | For SEC, we used the following well-studied benchmark datasets: IMDB [Maas et al., 2011], Elec [Johnson and Zhang, 2015], and Rotten Tomatoes [Pang and Lee, 2005]. For CAC, we utilized DBpedia [Lehmann et al., 2015] and RCV1 [Lewis et al., 2004]. For GED, we utilized the First Certificate in English dataset (FCE-public) [Yannakoudakis et al., 2011]. |
| Dataset Splits | Yes | Table 1: Summary of datasets. Following [Miyato et al., 2017], we split the original training data into training and development sentences. We utilized an early stopping criterion [Caruana et al., 2000] based on the performance measured on development sets. |
| Hardware Specification | No | The paper only mentions 'with GPU support' but does not specify any particular GPU model, CPU, or detailed hardware specifications used for experiments. |
| Software Dependencies | No | The paper states 'using Chainer [Tokui et al., 2015]', but does not provide version numbers for Chainer or any other software dependencies. |
| Experiment Setup | Yes | The hyper-parameters are summarized in Table 2, with dropout [Srivastava et al., 2014] and Adam [Kingma and Ba, 2014]. In addition, we set ϵ = 5.0 for both AdvT-Text and VAT-Text and ϵ = 15.0 for our method. We also set λ = 1 for all the methods. We utilized an early stopping criterion [Caruana et al., 2000] based on the performance measured on development sets. |
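
The quoted setup (ϵ-scaled perturbations, a λ-weighted adversarial loss term, Adam, dropout) follows the adversarial-training-on-word-embeddings recipe of Miyato et al. [2017] that the paper builds on. The snippet below is a minimal, illustrative sketch of how ϵ and λ enter the training loss; it is not the authors' Chainer implementation (see the linked repository for that), and the `model` callable, tensor shapes, and function names are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def adversarial_loss(model, embeddings, labels, epsilon=5.0, lam=1.0):
    """Sketch of epsilon-scaled adversarial training on input embeddings
    (AdvT-Text style, Miyato et al., 2017).

    Assumes `model` maps a (batch, seq_len, dim) embedding tensor to class
    logits. This is NOT the paper's code; it only illustrates the role of
    the epsilon and lambda hyper-parameters quoted above.
    """
    embeddings = embeddings.detach().requires_grad_(True)

    # Clean loss and its gradient with respect to the input embeddings.
    clean_loss = F.cross_entropy(model(embeddings), labels)
    grad, = torch.autograd.grad(clean_loss, embeddings)

    # L2-normalized perturbation of norm epsilon in the loss-increasing direction.
    norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1)
    r_adv = epsilon * grad / norm

    # Adversarial loss on the perturbed embeddings, weighted by lambda.
    adv_loss = F.cross_entropy(model(embeddings + r_adv.detach()), labels)
    return clean_loss + lam * adv_loss
```

With ϵ = 5.0 and λ = 1 this matches the AdvT-Text/VAT-Text settings quoted above; the paper uses ϵ = 15.0 for its own methods, whose perturbations are additionally constrained to interpretable directions in the input embedding space.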