Coeditor: Leveraging Repo-level Diffs for Code Auto-editing
Authors: Jiayi Wei, Greg Durrett, Isil Dillig
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In a simplified single-round, single-edit task, Coeditor significantly outperforms GPT-3.5 and SOTA open-source code completion models (bringing exact-match accuracy from 34.7 up to 60.4), demonstrating the benefits of incorporating editing history for code completion. In a multi-round, multi-edit setting, we observe substantial gains by iteratively conditioning on additional user edits. We have open-sourced our code, data, and model weights to encourage future research and have released a VSCode extension powered by our model for interactive IDE usage. |
| Researcher Affiliation | Collaboration | Jiayi Wei (Augment Computing, Inc., jiayi@augmentcode.com); Greg Durrett and Isil Dillig (University of Texas at Austin, {gdurrett, isil}@cs.utexas.edu) |
| Pseudocode | No | The paper does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | We have open-sourced our code, data, and model weights to encourage future research and have released a VSCode extension powered by our model for interactive IDE usage. Available at https://github.com/mrvplusone/Coeditor. |
| Open Datasets | Yes | We collect a code editing dataset from the commit histories of 1650 open-source Python projects for training and evaluation. ... We release our source code, dataset, model checkpoint, as well as a VSCode extension that supports interactive usage to foster future research. Available at https://github.com/mrvplusone/Coeditor. |
| Dataset Splits | Yes | We use 50 of the projects for testing and 50 for validation and use the remaining 1,550 projects for training. Table 1 (general statistics of the PYCOMMITS dataset) lists 1,550 training projects, 50 validation projects, and 50 test projects. |
| Hardware Specification | Yes | Training took about 5 days on a single NVIDIA Quadro RTX 8000 GPU with 48 GB memory. |
| Software Dependencies | No | The paper mentions Huggingface's Trainer implementation and the AdamW optimizer, but does not provide specific version numbers for these or other software dependencies such as Python, PyTorch, or other libraries used. |
| Experiment Setup | Yes | We initialize Coeditor with the CodeT5-base checkpoint (220M parameters) and train the model on our training set for 1.75 epochs, gradually increasing the model's reference context size from 2048 tokens to 4096 tokens (at epoch 1) and then to 8192 tokens (at epoch 1.5). We use Huggingface's Trainer implementation and the AdamW optimizer with a linear learning-rate schedule, a starting learning rate of 2e-5, and 0.01 weight decay. We train the model with a fixed batch size of 1 for a total of 1.34 million training steps. (A hedged configuration sketch appears below the table.) |
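
For illustration, the following is a minimal sketch of how the hyperparameters reported in the Experiment Setup row could be expressed with Huggingface's `TrainingArguments`; it is not the authors' released training script. The checkpoint id `Salesforce/codet5-base`, the output directory name, and the `reference_context_tokens` helper are assumptions, and how the context-size curriculum is wired into the data pipeline is not specified here.

```python
# Hedged sketch of the reported training configuration; not the authors' script.
from transformers import T5ForConditionalGeneration, TrainingArguments

# CodeT5-base (220M parameters); the exact hub id is an assumption.
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

training_args = TrainingArguments(
    output_dir="coeditor-ckpt",     # placeholder output directory
    num_train_epochs=1.75,          # paper: 1.75 epochs (~1.34M steps at batch size 1)
    per_device_train_batch_size=1,  # fixed batch size of 1
    learning_rate=2e-5,             # starting learning rate for AdamW
    lr_scheduler_type="linear",     # linear learning-rate schedule
    weight_decay=0.01,
)

def reference_context_tokens(epoch: float) -> int:
    """Context-size curriculum reported in the paper: 2048 tokens, then 4096 at
    epoch 1, then 8192 at epoch 1.5. How this feeds the data pipeline is assumed."""
    if epoch >= 1.5:
        return 8192
    if epoch >= 1.0:
        return 4096
    return 2048
```

With a batch size of 1, 1.75 epochs over the training set correspond to the reported 1.34 million training steps; AdamW is Trainer's default optimizer, so this sketch does not construct an optimizer explicitly.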