Repository-Level Prompt Generation for Large Language Models of Code
Authors: Disha Shrivastava, Hugo Larochelle, Daniel Tarlow
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on the task of single-line code auto-completion using code repositories taken from Google Code archives. We demonstrate that an oracle constructed from our prompt proposals gives a relative improvement of 36% over Codex, showing the quality of these proposals. Further, we show that when we train a model to predict a prompt proposal, we can achieve significant performance gains over Codex and other baselines. |
| Researcher Affiliation | Collaboration | Disha Shrivastava (Mila, Université de Montréal), Hugo Larochelle (Mila, Université de Montréal, Google, CIFAR Associate Fellow), Daniel Tarlow (Mila, McGill University, Google). |
| Pseudocode | No | The paper describes the methods in prose but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release our code, data, and trained checkpoints at https://github.com/shrivastavadisha/repo_level_prompt_generation. |
| Open Datasets | Yes | Instead, we scraped Google Code https://code.google.com/archive/ for repositories in Java (removing the ones that matched with a repository on GitHub with the same name). |
| Dataset Splits | Yes | We divided the repositories into train, validation, and test splits, where each repository in its entirety is part of a split. Train: 19 repositories, 2655 files, 92721 holes; Validation: 14 repositories, 1060 files, 48548 holes; Test: 14 repositories, 1308 files, 48288 holes; Total: 47 repositories, 4757 files, 189557 holes. |
| Hardware Specification | Yes | The computational complexity of training our larger RLPG-R variant (3.6M parameters, 141269 holes, and 9.19 minutes per epoch on a single Tesla V100 GPU) is much smaller than finetuning all or some part of Codex (175B parameters). ... Besides training the PPC, all our experiments were performed on a CPU with 8GB RAM. |
| Software Dependencies | Yes | We used the OpenAI Codex Completions API for generating the predicted hole from the Codex model. In particular, we used the code-davinci-001 engine with the temperature set to 0.0 and stop criteria as a newline. ... We used the tree-sitter API for Java ... For the BM25-based baselines, we use the Okapi BM25 implementation with default parameters given by the pip package rank-bm25 0.2.2. ... We used CodeBERT (Feng et al., 2020) as our pretrained model Fϕ... (A hedged BM25 usage sketch follows the table.) |
| Experiment Setup | Yes | We used the OpenAI Codex Completions API for generating the predicted hole from the Codex model. In particular, we used the code-davinci-001 engine with the temperature set to 0.0 and stop criteria as a newline. The completion length was 24 and the maximum prompt length was 4072. ... We used Adam (Kingma & Ba, 2015) optimizer with a learning rate of 3e-4 and batch size of 64. ... A dropout value of 0.25 was used while training. (Hedged sketches of the completion call and the optimizer configuration follow the table.) |
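
To make the reported decoding settings concrete, here is a minimal sketch of how the completion call could look through the legacy OpenAI Completions API (the interface available when the paper was written). The prompt contents, the API-key handling, and the variable names are illustrative assumptions; only the engine name, temperature, stop criterion, and completion length come from the paper.

```python
import openai  # legacy (pre-1.0) OpenAI client, matching the API era of the paper

openai.api_key = "YOUR_API_KEY"  # assumption: key management is up to the user

# Hypothetical inputs: a prompt-proposal context drawn from elsewhere in the
# repository, concatenated with the code preceding the target hole.
proposal_context = "// method bodies from the imported file ...\n"
code_before_hole = "int area = width *"

response = openai.Completion.create(
    engine="code-davinci-001",  # Codex engine reported in the paper
    prompt=proposal_context + code_before_hole,  # kept within the 4072-token budget
    max_tokens=24,              # completion length used in the experiments
    temperature=0.0,            # deterministic decoding
    stop="\n",                  # stop at the end of the predicted line
)
predicted_hole = response["choices"][0]["text"]
print(predicted_hole)
```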
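
The BM25 baseline can likewise be approximated with the rank-bm25 package cited above. The whitespace tokenization, the toy repository files, and the decision to rank files against the hole's surrounding context are assumptions made for illustration; only the package and its default Okapi parameters are from the paper.

```python
from rank_bm25 import BM25Okapi  # pip install rank-bm25==0.2.2

# Hypothetical corpus: contents of the other files in the repository.
repo_files = [
    "public class Vector { double x; double y; }",
    "public class Matrix { double[][] values; }",
    "import java.util.List; public class Store { List<String> items; }",
]
tokenized_corpus = [doc.split() for doc in repo_files]  # naive whitespace tokens
bm25 = BM25Okapi(tokenized_corpus)  # default Okapi BM25 parameters

# Query: the code surrounding the target hole.
query = "double x = vector.x + other.x;".split()
scores = bm25.get_scores(query)  # one relevance score per repository file

# Use the highest-scoring file as retrieved context for the prompt.
best_file = repo_files[scores.argmax()]
print(best_file)
```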
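
Finally, the optimizer settings in the setup row map directly onto a standard PyTorch configuration. The network below is a placeholder stand-in (the paper's prompt proposal classifier is not reproduced here), and the layer sizes and proposal count are assumptions; only the Adam learning rate, batch size, and dropout value are taken from the paper.

```python
import torch
import torch.nn as nn

# Placeholder stand-in for the prompt proposal classifier: the real model
# scores prompt proposals from CodeBERT-encoded context representations.
num_proposals = 63  # assumption for illustration, not a figure from this table
model = nn.Sequential(
    nn.Linear(768, 256),            # 768 = CodeBERT hidden size
    nn.ReLU(),
    nn.Dropout(p=0.25),             # dropout value reported in the paper
    nn.Linear(256, num_proposals),  # one logit per candidate prompt proposal
)

optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)  # reported learning rate
batch_size = 64  # reported batch size for training
```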