Interpretable Adversarial Perturbation in Input Embedding Space for Text

Authors: Motoki Sato, Jun Suzuki, Hiroyuki Shindo, Yuji Matsumoto

IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conducted our experiments on a sentiment classification (SEC) task, a category classification (CAC) task, and a grammatical error detection (GED) task to evaluate the effectiveness of our methods, iAdvT-Text and iVAT-Text.
Researcher Affiliation | Collaboration | 1 Preferred Networks, Inc.; 2 NTT Communication Science Laboratories; 3 Nara Institute of Science and Technology; 4 RIKEN Center for Advanced Intelligence Project
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code for reproducing our experiments is available at https://github.com/aonotas/interpretable-adv
Open Datasets | Yes | For SEC, we used the following well-studied benchmark datasets: IMDB [Maas et al., 2011], Elec [Johnson and Zhang, 2015], and Rotten Tomatoes [Pang and Lee, 2005]. For CAC, we utilized DBpedia [Lehmann et al., 2015] and RCV1 [Lewis et al., 2004]. For GED, we utilized the First Certificate in English dataset (FCE-public) [Yannakoudakis et al., 2011].
Dataset Splits | Yes | Table 1: Summary of datasets. Following [Miyato et al., 2017], we split the original training data into training and development sentences. We utilized an early stopping criterion [Caruana et al., 2000] based on the performance measured on development sets.
Hardware Specification | No | The paper only mentions 'with GPU support' but does not specify any particular GPU model, CPU, or other hardware details used for the experiments.
Software Dependencies | No | The paper states 'using Chainer [Tokui et al., 2015]' but does not provide a version number for Chainer or any other software dependencies.
Experiment Setup | Yes | The hyper-parameters are summarized in Table 2, with dropout [Srivastava et al., 2014] and Adam [Kingma and Ba, 2014]. In addition, we set ϵ = 5.0 for both AdvT-Text and VAT-Text and ϵ = 15.0 for our method. We also set λ = 1 for all the methods. We utilized an early stopping criterion [Caruana et al., 2000] based on the performance measured on development sets.
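To make the quoted experiment setup concrete, here is a minimal sketch of how the ϵ-scaled adversarial perturbation and the λ-weighted loss term from AdvT-Text/VAT-Text [Miyato et al., 2017] fit together. This is not the authors' released Chainer code; it uses PyTorch, and the `model(emb)` interface (a classifier that consumes pre-computed word embeddings) is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def adversarial_term(model, emb, labels, epsilon=5.0):
    """Loss on an epsilon-bounded adversarial perturbation of the word embeddings.

    epsilon = 5.0 for AdvT-Text / VAT-Text and 15.0 for the proposed
    interpretable variants, per the setup quoted above.
    """
    emb = emb.detach().requires_grad_(True)
    clean_loss = F.cross_entropy(model(emb), labels)
    grad, = torch.autograd.grad(clean_loss, emb)
    # Rescale the gradient per example to an L2 ball of radius epsilon.
    r_adv = epsilon * grad / (grad.flatten(1).norm(dim=1).view(-1, 1, 1) + 1e-12)
    return F.cross_entropy(model(emb + r_adv.detach()), labels)

# Training objective with the lambda = 1 weighting reported in the paper:
# loss = F.cross_entropy(model(emb), labels) + 1.0 * adversarial_term(model, emb, labels)
```

Note that the proposed iAdvT-Text/iVAT-Text methods restrict the perturbation direction further (toward existing word embeddings); the sketch above only shows the baseline formulation the ϵ and λ values attach to.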
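The dataset-splits and experiment-setup rows both cite an early-stopping criterion measured on the development split [Caruana et al., 2000]. A minimal sketch of such a criterion is below; the `train_epoch`/`evaluate` callables and the `patience` value are illustrative assumptions, not taken from the paper or its repository.

```python
def train_with_early_stopping(train_epoch, evaluate, max_epochs=30, patience=5):
    """Stop when the development-set score has not improved for `patience` epochs."""
    best_score, best_epoch = float("-inf"), 0
    for epoch in range(1, max_epochs + 1):
        train_epoch()          # one pass over the training split
        score = evaluate()     # e.g. accuracy (SEC/CAC) or F0.5 (GED) on the dev split
        if score > best_score:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            break              # no improvement for `patience` consecutive epochs
    return best_score
```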