LongCoder: A Long-Range Pre-trained Language Model for Code Completion
Authors: Daya Guo, Canwen Xu, Nan Duan, Jian Yin, Julian McAuley
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on a newly constructed dataset that contains longer code context and the publicly available CodeXGLUE benchmark. Experimental results demonstrate that LongCoder achieves superior performance on code completion tasks compared to previous models while maintaining comparable efficiency in terms of computational resources during inference. |
| Researcher Affiliation | Collaboration | Sun Yat-sen University; University of California, San Diego; Microsoft Research Asia. |
| Pseudocode | No | The paper describes the model architecture and attention mechanisms using mathematical equations but does not present pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | All the codes and data are available at https://github.com/microsoft/CodeBERT. |
| Open Datasets | Yes | To evaluate the effectiveness of LongCoder and encourage future research on Long Code Completion, we construct a new dataset called LCC... Specifically, we construct our datasets from the github-code dataset, which contains a vast number of code files sourced from GitHub with an open-source license that permits research use. (https://huggingface.co/datasets/codeparrot/github-code) |
| Dataset Splits | Yes | For each programming language, we sample 100k examples for training, and 10k examples for development and 10k for testing. |
| Hardware Specification | Yes | The inference memory consumption and runtime per example are calculated using a beam search with beam size of 5 and maximum generation length of 64 on a single V100 GPU. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' and 'tree-sitter' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | During finetuning, we use the Adam optimizer with a batch size of 16 and a learning rate of 2e-4. We fine-tune the model for 10 epochs and perform early stopping on the development set. |
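
The dataset rows above quote that LCC is built from the codeparrot/github-code corpus on the Hugging Face Hub, with 100k training, 10k development, and 10k test examples sampled per language. The sketch below shows one way such splits could be sampled with the `datasets` library; it is a minimal illustration rather than the paper's LCC construction pipeline, and the language choice, shuffle seed, and buffer size are assumptions.

```python
from itertools import islice
from datasets import load_dataset

# Stream the source corpus referenced in the paper
# (https://huggingface.co/datasets/codeparrot/github-code).
ds = load_dataset(
    "codeparrot/github-code",
    split="train",
    streaming=True,
    languages=["Python"],  # assumption: repeat per target language
)

# Hypothetical split sampling matching the sizes quoted above:
# 100k train, 10k development, 10k test examples per language.
it = iter(ds.shuffle(seed=42, buffer_size=10_000))
train = list(islice(it, 100_000))
dev = list(islice(it, 10_000))
test = list(islice(it, 10_000))
```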
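
The experiment-setup row quotes Adam with a batch size of 16, a learning rate of 2e-4, 10 epochs, and early stopping on the development set. The PyTorch sketch below mirrors that configuration; `model`, `train_loader`, `dev_loader`, and `evaluate` are placeholders for components defined in the released code, and the patience value is an assumption since the paper does not state one.

```python
import torch
from torch.optim import Adam

# Quoted hyperparameters: Adam optimizer, batch size 16, lr 2e-4,
# 10 epochs with early stopping on the development set.
optimizer = Adam(model.parameters(), lr=2e-4)

best_dev, bad_epochs, patience = float("-inf"), 0, 2  # patience is an assumption
for epoch in range(10):
    model.train()
    for batch in train_loader:  # DataLoader built with batch_size=16
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    dev_score = evaluate(model, dev_loader)  # placeholder dev metric
    if dev_score > best_dev:
        best_dev, bad_epochs = dev_score, 0
        torch.save(model.state_dict(), "best_model.pt")
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # early stopping on the development set
```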
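
The hardware row reports inference memory and runtime measured with beam search (beam size 5, maximum generation length 64) on a single V100 GPU. A hedged sketch of an equivalent Hugging Face `generate` call is shown below; the checkpoint identifier and the `AutoModelForCausalLM` loading path are assumptions, since the official release at https://github.com/microsoft/CodeBERT ships its own model and generation code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "microsoft/longcoder-base"  # assumed identifier; may differ from the official release
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to("cuda").eval()

prompt = "def binary_search(arr, target):\n    low, high = 0, len(arr) - 1\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Decoding settings quoted above: beam search with beam size 5 and
# a maximum generation length of 64 tokens.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        num_beams=5,
        max_new_tokens=64,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```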