Learning to Complete Code with Sketches
Authors: Daya Guo, Alexey Svyatkovskiy, Jian Yin, Nan Duan, Marc Brockschmidt, Miltiadis Allamanis
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, GRAMMFORMER generates 10-50% more accurate completions compared to traditional generative models and 37-50% longer sketches compared to sketch-generating baselines trained with similar techniques. |
| Researcher Affiliation | Collaboration | Daya Guo (School of Computer Science and Engineering, Sun Yat-sen University, China; guody5@mail2.sysu.edu.cn); Alexey Svyatkovskiy (Microsoft, Redmond, WA, USA; alsvyatk@microsoft.com); Jian Yin (School of Computer Science and Engineering, Sun Yat-sen University, China; issjyin@mail.sysu.edu.cn); Nan Duan (Microsoft Research, Beijing, China; nanduan@microsoft.com); Marc Brockschmidt and Miltiadis Allamanis (Microsoft Research, Cambridge, UK; {mabrocks,miallama}@microsoft.com) |
| Pseudocode | Yes | Algorithm 1: GRAMMFORMER generative process, given an input sequence x^(0). A hedged sketch of this generative loop is given after the table. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | No | To collect a dataset, we clone all non-fork repositories with more than 20 stars on GitHub that have C# or Python as their top language. |
| Dataset Splits | Yes | Finally, we split the files into 70-10-20 train-validation-test. A minimal file-level split sketch is given after the table. |
| Hardware Specification | Yes | Training used 64 NVIDIA Tesla P100 GPUs with 16GB memory for 10 days. |
| Software Dependencies | Yes | Finally, we parse all files into a syntax tree using tree-sitter, ignoring any files that cannot be parsed using the v0.19.0 grammar definitions. |
| Experiment Setup | Yes | Most of our models use a 6-layer Transformer as encoder and a 6-layer Transformer as decoder, each with a hidden dimension of 768 and 12 attention heads, with the exception of the LM model (and its variations), which uses a single 12-layer Transformer to match the number of parameters of the other models. We set the intermediate dimension of each Transformer layer as 3072... We set the max length of input and output sequences as 512 and 64, respectively. We train the model with the Adam optimiser using a learning rate of 2e-5 and batch size 4,096. A hedged configuration sketch collecting these values is given after the table. |
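As a reading aid for the Pseudocode row, the following is a minimal sketch of the Algorithm 1 generative loop: starting from an input sequence that may contain nonterminals, the model repeatedly picks a nonterminal to expand (or stops), and any nonterminal left unexpanded remains a hole in the output sketch. The `model.select_nonterminal` and `model.expand` helpers are hypothetical stand-ins for the selection and generation components described in the paper, not its actual API.

```python
# Hedged sketch of the GRAMMFORMER generative process (Algorithm 1).
# `model.select_nonterminal` and `model.expand` are hypothetical helpers;
# STOP marks the model's decision to expand nothing further.

STOP = None  # sentinel returned when the model chooses to stop expanding

def generate_sketch(model, x0, max_steps=64):
    """Iteratively expand nonterminals in x0 until the model stops.

    x0 is a token sequence that may contain nonterminal symbols; any
    nonterminal left unexpanded when the loop ends remains a hole in
    the returned sketch.
    """
    x = list(x0)
    for _ in range(max_steps):
        # Pick the index of a nonterminal to expand, or STOP to finish.
        idx = model.select_nonterminal(x)
        if idx is STOP:
            break
        # Generate an expansion (a token sequence, possibly containing
        # new nonterminals) conditioned on the current sequence and idx.
        expansion = model.expand(x, idx)
        x = x[:idx] + expansion + x[idx + 1:]
    return x  # remaining nonterminals act as holes in the sketch
```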
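For the Dataset Splits row, a minimal file-level 70-10-20 split could look like the sketch below; the shuffling, seed, and helper name are assumptions for illustration and are not taken from the paper.

```python
import random

def split_files(files, seed=0):
    """Shuffle files and return a 70-10-20 train/validation/test split."""
    files = list(files)
    random.Random(seed).shuffle(files)
    n = len(files)
    n_train = int(0.7 * n)
    n_valid = int(0.1 * n)
    train = files[:n_train]
    valid = files[n_train:n_train + n_valid]
    test = files[n_train + n_valid:]
    return train, valid, test
```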
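For the Experiment Setup row, the quoted hyperparameters can be collected into a single configuration sketch such as the one below; the dictionary layout and field names are assumptions made for readability, while the values are the ones reported in the paper.

```python
# Hyperparameters quoted in the Experiment Setup row; the dict structure
# itself is only an illustrative assumption.
GRAMMFORMER_CONFIG = {
    "encoder_layers": 6,        # 6-layer Transformer encoder
    "decoder_layers": 6,        # 6-layer Transformer decoder
    "hidden_dim": 768,
    "attention_heads": 12,
    "intermediate_dim": 3072,
    "max_input_length": 512,
    "max_output_length": 64,
    "optimizer": "Adam",
    "learning_rate": 2e-5,
    "batch_size": 4096,
    # The LM baseline instead uses a single 12-layer Transformer to match
    # the parameter count of the encoder-decoder models.
}
```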