Automatic Code Review by Learning the Revision of Source Code

Authors: Shu-Ting Shi, Ming Li, David Lo, Ferdian Thung, Xuan Huo

AAAI 2019, pp. 4910-4917

Reproducibility assessment. Each entry below lists the variable, the assessed result, and the supporting LLM response.

Research Type: Experimental
LLM Response: "Experimental results on six open source software projects indicate by learning the revision features, DACE can outperform the competing approaches in automatic code review. To evaluate the effectiveness of DACE, we conduct experiments on open source software projects and compare the results with several baselines."

Researcher Affiliation: Academia
LLM Response: Shu-Ting Shi (1), Ming Li (1,2), David Lo (3), Ferdian Thung (3), Xuan Huo (1). Affiliations: (1) National Key Laboratory for Novel Software Technology, Nanjing University, China; (2) Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing University, China; (3) School of Information Systems, Singapore Management University, Singapore.

Pseudocode: No
LLM Response: The paper uses figures and textual descriptions to explain the model architecture and processes, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.

Open Source Code: No
LLM Response: The paper does not state that the source code for the proposed DACE model is publicly available, nor does it provide a link to a code repository.

Open Datasets: Yes
LLM Response: "The code review data used in the experiments is crawled from Apache projects [1], which is a widely used code review source (Rigby and Bird 2013; Rigby and German 2006; Rigby, German, and Storey 2008)." Footnote [1]: Apache Code Review Board, https://reviews.apache.org/r/

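Since the paper points at the live Apache Review Board instance rather than a packaged dataset, reproducing the data collection means crawling that site. Below is a minimal Python sketch, assuming the standard Review Board REST API (the /api/review-requests/ endpoint and its JSON layout); the paper does not describe its crawler, so the endpoint names, query parameters, and fields here are assumptions, not the authors' method.

    # Sketch: page through review requests on reviews.apache.org.
    # Assumes the public Review Board Web API; not taken from the paper.
    import requests

    BASE = "https://reviews.apache.org/api"

    def fetch_review_requests(start=0, max_results=25):
        """Fetch one page of review requests (hypothetical helper)."""
        resp = requests.get(
            f"{BASE}/review-requests/",
            params={"start": start, "max-results": max_results, "status": "all"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json().get("review_requests", [])

    for rr in fetch_review_requests():
        print(rr["id"], rr.get("summary", ""))
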
Dataset Splits: Yes
LLM Response: "For each data set, 10-fold cross validation is repeated ten times, and we use F1 and AUC (Area Under ROC Curve) to measure the effectiveness of the DACE model, which have been widely applied for evaluating imbalanced data."

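The quoted protocol (10-fold cross validation repeated ten times, scored with F1 and AUC) maps directly onto off-the-shelf tooling. A minimal scikit-learn sketch follows, using synthetic imbalanced data and a stand-in classifier since DACE itself is not released; the stratified folds are an assumption, as the paper only says "10-fold cross validation".

    # Sketch: 10x10-fold cross validation scored with F1 and AUC.
    # LogisticRegression is a stand-in for the (unreleased) DACE model.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate

    # Synthetic imbalanced data, standing in for the Apache review data.
    X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
    scores = cross_validate(LogisticRegression(max_iter=1000), X, y,
                            cv=cv, scoring=("f1", "roc_auc"))

    print("F1:  %.3f" % np.mean(scores["test_f1"]))
    print("AUC: %.3f" % np.mean(scores["test_roc_auc"]))
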
Hardware Specification: No
LLM Response: The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.

Software Dependencies: No
LLM Response: The paper mentions various models and techniques (e.g., CNN, LSTM, word2vec, Logistic Regression, SVM, DBN, GRU) and their configurations, but it does not specify version numbers for any software libraries, frameworks, or programming languages used.

Experiment Setup: Yes
LLM Response: "We employ the most commonly used ReLU σ(x) = max(x, 0) as active function and the filter windows size is set as 2, 3, 4, with 100 feature maps each in CNN. The number of neuron dimension in LSTM is set as 300. The encoders and decoders in PAE are GRUs with the cell size of 256. And the MLP for final prediction is two layers of a fully connected network of size 256 and 100. The cost weights c_a and c_r are set inversely proportional to class instance numbers."

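The quoted hyperparameters are concrete enough to wire into a skeleton. The PyTorch sketch below instantiates exactly those sizes; how the modules compose into the full DACE pipeline is not specified in this excerpt, so the composition, vocabulary size, embedding dimension, output head, and class counts are illustrative assumptions.

    # Sketch: the hyperparameters quoted above, wired into PyTorch modules.
    # Configuration only; the real DACE forward pass is not reproduced here.
    import torch
    import torch.nn as nn

    EMBED_DIM = 300  # assumed embedding size; not quoted in this excerpt

    class DACELikeConfig(nn.Module):
        def __init__(self, vocab_size=10000):  # vocabulary size is hypothetical
            super().__init__()
            self.embed = nn.Embedding(vocab_size, EMBED_DIM)
            # CNN: filter windows 2, 3, 4 with 100 feature maps each (ReLU).
            self.convs = nn.ModuleList(
                nn.Conv1d(EMBED_DIM, 100, kernel_size=k) for k in (2, 3, 4)
            )
            # LSTM with 300 neuron dimensions; input assumed to be the
            # 3 x 100 concatenated CNN feature maps.
            self.lstm = nn.LSTM(input_size=300, hidden_size=300, batch_first=True)
            # PAE encoder and decoder GRUs with cell size 256.
            self.pae_encoder = nn.GRU(input_size=300, hidden_size=256, batch_first=True)
            self.pae_decoder = nn.GRU(input_size=256, hidden_size=256, batch_first=True)
            # Two fully connected layers of sizes 256 and 100; the final
            # 2-way projection is an added assumption for a binary verdict.
            self.mlp = nn.Sequential(
                nn.Linear(256, 256), nn.ReLU(),
                nn.Linear(256, 100), nn.ReLU(),
                nn.Linear(100, 2),
            )

    # Cost weights c_a and c_r inversely proportional to class instance numbers.
    n_approved, n_rejected = 900, 100  # hypothetical class counts
    w = torch.tensor([1.0 / n_approved, 1.0 / n_rejected])
    criterion = nn.CrossEntropyLoss(weight=w / w.sum())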