Generating Adversarial Examples for Holding Robustness of Source Code Processing Models

Authors: Huangzhao Zhang, Zhuo Li, Ge Li, Lei Ma, Yang Liu, Zhi Jin (pp. 1169-1176)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our in-depth evaluation on a functionality classification benchmark demonstrates the effectiveness of MHM in generating adversarial examples of source code. The higher robustness and performance enhanced through our adversarial training with MHM further confirms the usefulness of DL models-based method for future fully automated source code processing." The Experiments section adds: "In this section, we perform in-depth evaluation to demonstrate the usefulness of our proposed technique. We first introduce our experimental setups, and then present the experimental results of MHM on adversarial attack and adversarial training, respectively."
Researcher Affiliation | Academia | Huangzhao Zhang¹, Zhuo Li¹, Ge Li¹, Lei Ma², Yang Liu³, Zhi Jin¹ — ¹Key Lab of High Confidence Software Technologies (Peking University, China), Ministry of Education; ²Kyushu University, Japan; ³Nanyang Technological University, Singapore
Pseudocode | Yes | "Algorithm 1: Metropolis-Hastings Modifier algorithm."
Open Source Code | Yes | "Our tool and data are open-sourced and publicly available": https://github.com/Metropolis-Hastings-Modifier/MHM
Open Datasets | Yes | "Dataset. We choose the Open Judge (OJ) dataset, the benchmark dataset in source code classification, as the study subject, which is proposed by Mou et al. (2016)."
Dataset Splits | Yes | "Finally, we split the filtered dataset (4 : 1), resulting in a training set with the size of 38,924 and a test set with the size of 9,718... During training, we randomly extract 20% code files from the training set, forming the validation set."
Hardware Specification | No | The paper notes that "GA experiments take several days, while MHM only takes about several hours", indicating computational cost, but it gives no specific hardware details (GPU model, CPU, or memory) used to run the experiments.
Software Dependencies | No | The paper mentions a "C++ (ver. 11) parser" for filtering, the pycparser tool for AST generation, and the Adam and AdaMax optimizers in Table 1. However, it provides no version numbers for pycparser or the optimizers, nor for the underlying languages and frameworks (e.g. Python, PyTorch, or TensorFlow) that are typically listed as pinned software dependencies.
Experiment Setup | Yes | Table 1 ("Hyper-parameters of the subject models") explicitly lists detailed hyperparameters for the LSTM and ASTNN models, including vocabulary size, embedding size, hidden size, layers, dropout, batch size, optimizer, and learning rate.
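The Metropolis-Hastings Modifier listed under Pseudocode iterates identifier-renaming proposals on the source code and keeps or discards each one with a Metropolis-Hastings acceptance test. A minimal, illustrative sketch of those two pieces (the function and variable names here are ours, not the paper's; in the paper the score would come from the victim classifier):

```python
import random
import re


def rename_identifier(code, old, new):
    """Propose a semantics-preserving transformation: rename one
    identifier everywhere in the snippet (word-boundary match only,
    so substrings of longer names are left alone)."""
    return re.sub(rf"\b{re.escape(old)}\b", new, code)


def mh_accept(score_old, score_new, rng=random):
    """Metropolis-Hastings acceptance: always keep an improving
    proposal; otherwise keep it with probability score_new/score_old."""
    if score_new >= score_old:
        return True
    return rng.random() < score_new / max(score_old, 1e-12)
```

In the attack setting, `score_*` would be the model's probability of the adversarial target label before and after the proposed renaming.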
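The split protocol quoted under Dataset Splits (a 4 : 1 train/test split, then 20% of the training set held out as validation) can be sketched as follows; the helper name and seed handling are ours:

```python
import random


def split_dataset(samples, test_frac=0.2, val_frac=0.2, seed=0):
    """Shuffle, hold out test_frac as the test set (the 4:1 split),
    then carve val_frac of the remaining training set off as validation."""
    rng = random.Random(seed)
    samples = samples[:]
    rng.shuffle(samples)
    cut = int(len(samples) * (1 - test_frac))
    train, test = samples[:cut], samples[cut:]
    val_cut = int(len(train) * val_frac)
    val, train = train[:val_cut], train[val_cut:]
    return train, val, test
```

On the paper's 48,642 filtered samples this yields roughly the reported 38,924 training / 9,718 test sizes before the validation carve-out.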
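The Table 1 fields cited under Experiment Setup can be captured as a small configuration object. The field names follow the table; the default values below are illustrative placeholders, not the paper's actual settings:

```python
from dataclasses import dataclass


@dataclass
class SubjectModelConfig:
    """Hyper-parameter fields named after Table 1 of the paper.
    All values here are placeholders for illustration only."""
    vocabulary_size: int = 5000
    embedding_size: int = 300
    hidden_size: int = 256
    layers: int = 1
    dropout: float = 0.5
    batch_size: int = 32
    optimizer: str = "Adam"
    learning_rate: float = 1e-3


# The two subject models differ (per Table 1) in optimizer, among others.
lstm_cfg = SubjectModelConfig(optimizer="Adam")
astnn_cfg = SubjectModelConfig(optimizer="AdaMax")
```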