Code Completion with Neural Attention and Pointer Networks

Authors: Jian Li, Yue Wang, Michael R. Lyu, Irwin King

IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on two benchmarked datasets demonstrate the effectiveness of our attention mechanism and pointer mixture network on the code completion task.
Researcher Affiliation | Academia | Jian Li, Yue Wang, Michael R. Lyu, Irwin King; Department of Computer Science and Engineering, The Chinese University of Hong Kong, China; Shenzhen Research Institute, The Chinese University of Hong Kong, China; {jianli, yuewang, lyu, king}@cse.cuhk.edu.hk
Pseudocode | No | The paper describes its methods with equations and diagrams but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about, or a link to, the source code for the methodology described.
Open Datasets | Yes | We evaluate different approaches on two benchmarked datasets: JavaScript (JS) and Python (PY), which are summarized in Table 2. Collected from GitHub, both datasets are publicly available (http://plml.ethz.ch) and were used in previous work [Bielik et al., 2016; Raychev et al., 2016; Liu et al., 2016].
Dataset Splits | Yes | Both datasets contain 150,000 program files stored in their corresponding AST formats, with the first 100,000 used for training and the remaining 50,000 used for testing.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, cloud instances) used for running the experiments.
Software Dependencies | No | The paper mentions software components such as an LSTM network, the Adam optimizer, and mini-batch SGD, but does not specify version numbers for these or any other software libraries.
Experiment Setup | Yes | Our base model is a single-layer LSTM network with an unrolling length of 50 and a hidden unit size of 1500. To train the model, we use the cross-entropy loss function and mini-batch SGD with the Adam optimizer [Kingma and Ba, 2014]. We set the initial learning rate to 0.001 and decay it by multiplying by 0.6 after every epoch. We clip the gradient norm to 5 to prevent gradients from exploding. The size of the attention window is 50. The batch size is 128 and we train the model for 8 epochs.
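
The Dataset Splits row reports a deterministic head/tail split of the 150,000 AST-serialized program files. The paper does not release preprocessing code, so the following Python sketch is only an illustration of that reported split; the function name and the input list of files are assumptions.

# Hypothetical reconstruction of the reported split: first 100,000 files
# for training, remaining 50,000 for testing. ast_files is an assumed
# list of serialized ASTs, one per program file, in the dataset's order.
def split_dataset(ast_files):
    assert len(ast_files) == 150_000, "paper reports 150,000 program files"
    train_files = ast_files[:100_000]   # first 100,000 -> training
    test_files = ast_files[100_000:]    # remaining 50,000 -> testing
    return train_files, test_files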
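
The Experiment Setup row pins down most of the reported training hyperparameters. Below is a minimal PyTorch sketch of that configuration (single-layer LSTM, hidden size 1500, unrolling length 50, cross-entropy loss, Adam at learning rate 0.001 decayed by 0.6 per epoch, gradient clipping at norm 5, batch size 128, 8 epochs). It covers only the base LSTM language model, not the paper's attention mechanism or pointer mixture network; the vocabulary size, embedding dimension, and data loading are placeholders, and this is not the authors' implementation.

import torch
import torch.nn as nn

# Hyperparameters reported in the paper's experiment setup.
UNROLL_LEN = 50      # LSTM unrolling length (the attention window is also 50)
HIDDEN_SIZE = 1500   # LSTM hidden unit size
BATCH_SIZE = 128
NUM_EPOCHS = 8
INIT_LR = 0.001
LR_DECAY = 0.6       # multiply the learning rate by 0.6 after every epoch
CLIP_NORM = 5.0      # clip the gradient norm to 5

VOCAB_SIZE = 50_000  # assumption: vocabulary size is not quoted in this row
EMBED_DIM = 300      # assumption: embedding size is a placeholder

class BaseLSTMModel(nn.Module):
    """Single-layer LSTM language model over AST node sequences (sketch)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.lstm = nn.LSTM(EMBED_DIM, HIDDEN_SIZE, num_layers=1, batch_first=True)
        self.proj = nn.Linear(HIDDEN_SIZE, VOCAB_SIZE)

    def forward(self, x, state=None):
        out, state = self.lstm(self.embed(x), state)
        return self.proj(out), state

model = BaseLSTMModel()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=INIT_LR)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=LR_DECAY)

def train(train_batches):
    """train_batches yields (inputs, targets) tensors of shape [BATCH_SIZE, UNROLL_LEN]."""
    for epoch in range(NUM_EPOCHS):
        for inputs, targets in train_batches:
            optimizer.zero_grad()
            logits, _ = model(inputs)
            loss = criterion(logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1))
            loss.backward()
            torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP_NORM)
            optimizer.step()
        scheduler.step()  # decay the learning rate by 0.6 after each epoch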