Deep Text Classification Can be Fooled

Authors: Bin Liang, Hongcheng Li, Miaoqiang Su, Pan Bian, Xirong Li, Wenchang Shi

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper is experimental: 'The experiment results show that the adversarial samples generated by our method can successfully fool both state-of-the-art character-level and word-level DNN-based text classifiers. The attack experiments show that despite the conciseness, our method can perform effective source/target misclassification attack against both DNNs and the adversarial samples generated by our three strategies satisfy all the requirements, i.e., fooling the target DNN, imperceptible perturbations and utility-preserving.' The evaluation is organized around four questions: 'Q1: Can our method perform effective source/target misclassification attack? Q2: Can the adversarial samples avoid being distinguished by human observers and still keep the utility? Q3: Is our method efficient enough? Q4: White-box and black-box, which is more powerful?'
Researcher Affiliation | Academia | Bin Liang, Hongcheng Li, Miaoqiang Su, Pan Bian, Xirong Li and Wenchang Shi; School of Information, Renmin University of China, Beijing, China; Key Laboratory of Data Engineering and Knowledge Engineering, MOE, Beijing, China. Contact: {liangb, owenlee, sumiaoqiang, bianpan, xirong, wenchang}@ruc.edu.cn
Pseudocode | No | The paper describes the methods narratively and with diagrams (e.g., Figure 9) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about releasing source code, nor a link to a code repository for the proposed method.
Open Datasets | Yes | The target models are trained and evaluated on public datasets: 'One is a character-level model [Zhang et al., 2015] and the other is word-level [Kim, 2014]. The character-level DNN is trained on a DBpedia ontology dataset, which contains 560,000 training samples and 70,000 testing samples of 14 high-level classes, such as Company, Building, Film and so on.' The word-level model is tested on several datasets, including MR, CR and MPQA: the MR dataset is a movie review repository (containing 10,662 reviews), CR contains 3,775 reviews about products (e.g. a music player), and MPQA contains 10,606 opinions.
Dataset Splits | No | The paper explicitly mentions 'training samples' and 'testing samples' for the DBpedia dataset but does not specify a separate validation split or how one would be derived. For example: 'DBpedia ontology dataset, which contains 560,000 training samples and 70,000 testing samples.' (A hedged sketch of obtaining the data and deriving such a split appears after this table.)
Hardware Specification | No | The paper mentions running experiments 'on a desktop computer' but does not provide specific details such as CPU model, GPU model, or memory specifications. For example: 'The white-box attack took 116 hours in total to compute the cost gradient and identify HTPs for all the 14 classes of the DBpedia dataset (8.29 hours per class) on a desktop computer.'
Software Dependencies | No | The paper refers to the target DNN models by citing their original papers ('[Zhang et al., 2015]' for the character-level model and '[Kim, 2014]' for the word-level model) and describes their architectures. However, it does not specify the software libraries, frameworks (e.g., TensorFlow, PyTorch), or their versions that were used to implement or run the adversarial attack method.
Experiment Setup | No | The paper describes the architecture of the target DNN models ('Through six convolutional layers and three fully-connected layers' for the character-level DNN; 'one convolutional layer, followed by a max pooling layer and a fully connected layer with dropout' for the word-level model) and details the process for generating adversarial samples (insertion, modification and removal strategies). However, it does not provide specific hyperparameters for applying the method (e.g., learning rates, batch sizes, or epochs for any internal model training), nor for training the target models, nor other system-level experimental settings. (Hedged sketches of the word-level architecture and of the perturbation strategies follow this table.)
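
To make the rows above more concrete, the following sketches illustrate what a minimal reproduction attempt could look like. They are hedged reconstructions, not the authors' code: the paper releases no implementation, so every library choice, hyperparameter and helper name below is an assumption.

First, a minimal sketch of obtaining the described DBpedia ontology data and deriving a validation split (the 'Open Datasets' and 'Dataset Splits' rows). The paper only reports a 560,000/70,000 train/test partition over 14 classes; the `dbpedia_14` dataset on the Hugging Face Hub matches those counts, and the 5% validation fraction is an arbitrary choice, not one taken from the paper.

```python
# Assumed data source: the "dbpedia_14" dataset on the Hugging Face Hub,
# which matches the 560,000/70,000 train/test counts reported in the paper.
from datasets import load_dataset

dbpedia = load_dataset("dbpedia_14")            # splits: "train" (560k), "test" (70k)

# The paper gives no validation split; carving 5% off the training set is a
# common default and purely an assumption here.
split = dbpedia["train"].train_test_split(test_size=0.05, seed=42)
train_set, val_set, test_set = split["train"], split["test"], dbpedia["test"]
print(len(train_set), len(val_set), len(test_set))   # 532000 28000 70000
```

Second, a minimal sketch of a Kim (2014)-style word-level CNN matching the coarse description quoted in the 'Experiment Setup' row ('one convolutional layer, followed by a max pooling layer and a fully connected layer with dropout'). The framework (PyTorch), embedding size, filter sizes and counts, and dropout rate are assumptions drawn from common reimplementations; the paper specifies none of them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WordCNN(nn.Module):
    """Word-level CNN text classifier in the spirit of Kim (2014)."""

    def __init__(self, vocab_size, num_classes, embed_dim=300,
                 filter_sizes=(3, 4, 5), num_filters=100, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # The "one convolutional layer" of the description is, in Kim's model,
        # a bank of parallel convolutions with different window sizes.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in filter_sizes]
        )
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        # Max-over-time pooling after each convolution.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))
```

Finally, a minimal sketch of the three perturbation strategies (insertion, modification, removal) expressed as plain text edits. In the paper these edits are guided by 'hot' phrases identified from cost gradients (HTPs/HSPs); that selection step is not reproduced here, so the `hot_phrase` arguments and the example edits are purely illustrative.

```python
def insert_phrase(text: str, hot_phrase: str, position: int = 0) -> str:
    """Insertion strategy: splice a class-indicative phrase into the sample."""
    words = text.split()
    words.insert(position, hot_phrase)
    return " ".join(words)

def modify_phrase(text: str, hot_phrase: str, replacement: str) -> str:
    """Modification strategy: replace a hot phrase with a perturbed variant
    (e.g. a common misspelling or visually similar characters)."""
    return text.replace(hot_phrase, replacement, 1)

def remove_phrase(text: str, hot_phrase: str) -> str:
    """Removal strategy: delete a phrase that pulls the sample toward its
    original (source) class."""
    return " ".join(text.replace(hot_phrase, " ", 1).split())

sample = "The film was produced by a small independent studio."
print(insert_phrase(sample, "historic building"))   # push toward another class
print(modify_phrase(sample, "film", "fiIm"))        # visually similar character edit
print(remove_phrase(sample, "produced by"))         # weaken the source-class signal
```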