Repoformer: Selective Retrieval for Repository-Level Code Completion
Authors: Di Wu, Wasi Uddin Ahmad, Dejiao Zhang, Murali Krishna Ramanathan, Xiaofei Ma
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using this LM as both the selective RAG policy and the generation model, our framework achieves state-of-the-art repository-level code completion performance on diverse benchmarks including RepoEval, CrossCodeEval, and CrossCodeLongEval, a new long-form code completion benchmark. |
| Researcher Affiliation | Collaboration | University of California, Los Angeles; AWS AI Labs. |
| Pseudocode | Yes | The full algorithms are presented in Appendix D. Algorithm 1 REPOFORMER Training Data Creation (Chunk Completion), Algorithm 2 REPOFORMER Training Data Creation (Function Completion) |
| Open Source Code | No | To facilitate future research on repository-level code completion, we will release our implementation and the Cross Code Long Eval benchmark at https://repoformer.github.io/. |
| Open Datasets | Yes | We leverage large-scale permissively licensed repositories from the Stack (Kocetkov et al., 2022) and create the fine-tuning data via a three-step procedure: [...] We perform comprehensive evaluations on a range of repository-level code completion tasks from RepoEval (Zhang et al., 2023), CrossCodeEval (Ding et al., 2023), and CrossCodeLongEval, a new large-scale benchmark focusing on code chunk and function completion. |
| Dataset Splits | Yes | We reserve 500 repositories for validation and use the rest for training. |
| Hardware Specification | Yes | The models are trained for 2 epochs, which approximately takes 8, 12, 20, and 50 hours for the 1B/3B/7B/16B models respectively with 8 Nvidia A100 GPUs (40G memory). Using this latency model, we benchmark the latency of various selective retrieval settings on RepoEval with the vllm library (Kwon et al., 2023) on a single Nvidia A100 GPU (80G). (See the latency sketch after the table.) |
| Software Dependencies | No | The paper mentions 'vllm library (Kwon et al., 2023)' and 'tree-sitter' but does not specify version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We fine-tune the 1B, 3B, 7B, and 16B variants of StarCoderBase with λ = 1.0, maximum sequence length 2048, learning rate 2e-5, batch size 512, 50 warmup steps, and a linear learning rate decay. The models are trained for 2 epochs. (See the configuration sketch after the table.) |
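The latency benchmarking quoted in the Hardware Specification row is reported to use the vllm library on a single A100. The snippet below is a minimal sketch of such a measurement, assuming a StarCoderBase checkpoint, greedy decoding, and a placeholder prompt; it illustrates the general setup rather than the paper's exact protocol.

```python
# Minimal latency-measurement sketch with vLLM (assumptions: model
# checkpoint, decoding settings, and prompt are illustrative only).
import time

from vllm import LLM, SamplingParams

llm = LLM(model="bigcode/starcoderbase-1b")            # assumed checkpoint
sampling_params = SamplingParams(temperature=0.0,       # greedy decoding
                                 max_tokens=64)         # assumed budget

prompts = ["def binary_search(arr, target):\n"]         # placeholder prompt

start = time.perf_counter()
outputs = llm.generate(prompts, sampling_params)
elapsed = time.perf_counter() - start

print(f"Generated {len(outputs)} completion(s) in {elapsed:.2f}s")
```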
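The fine-tuning hyperparameters quoted in the Experiment Setup row map naturally onto a Hugging Face `TrainingArguments` object. The sketch below is an illustration under stated assumptions: the micro-batch size, gradient accumulation, precision, and output path are not from the paper, and the paper's λ = 1.0 loss weighting belongs to its own training objective, which is not reproduced here.

```python
# Hedged sketch of the reported fine-tuning configuration
# (lr 2e-5, effective batch size 512, 2 epochs, 50 warmup steps,
# linear decay, max sequence length 2048).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="repoformer-ft",          # hypothetical output path
    num_train_epochs=2,                  # "trained for 2 epochs"
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    warmup_steps=50,
    per_device_train_batch_size=8,       # assumed micro-batch size
    gradient_accumulation_steps=8,       # 8 GPUs x 8 x 8 = 512 effective
    bf16=True,                           # assumed mixed-precision setting
    logging_steps=10,
    save_strategy="epoch",
)

# Sequences would be truncated or packed to the reported maximum length
# of 2048 tokens at tokenization time (not a TrainingArguments field).
MAX_SEQ_LENGTH = 2048
```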