TreeGen: A Tree-Based Transformer Architecture for Code Generation
Authors: Zeyu Sun, Qihao Zhu, Yingfei Xiong, Yican Sun, Lili Mou, Lu Zhang
AAAI 2020, pp. 8984-8991 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated TreeGen on a Python benchmark, HearthStone, and two semantic parsing benchmarks, ATIS and GEO. TreeGen outperformed the previous state-of-the-art approach by 4.5 percentage points on HearthStone, and achieved the best accuracy among neural network-based approaches on ATIS (89.1%) and GEO (89.6%). We also conducted an ablation test to better understand each component of our model. |
| Researcher Affiliation | Academia | Zeyu Sun, Qihao Zhu, Yingfei Xiong, Yican Sun, Lili Mou, Lu Zhang; Key Laboratory of High Confidence Software Technologies (Peking University), MoE; Software Institute, Peking University, 100871, P. R. China; {szy, zhuqh, xiongyf, sycpku, zhanglucs}@pku.edu.cn; University of Alberta, Edmonton, AB, Canada; doublepower.mou@gmail.com |
| Pseudocode | No | The paper describes the architecture and various components in detail but does not include any explicit pseudocode blocks or algorithms. |
| Open Source Code | Yes | The code is available at https://github.com/zysszy/TreeGen |
| Open Datasets | Yes | We followed the train-dev-test split in Ling et al. (2016), and the statistics are listed in Table 2. |
| Dataset Splits | Yes | We followed the train-dev-test split in Ling et al. (2016), and the statistics are listed in Table 2. |
| Hardware Specification | Yes | Training takes 18 s per epoch on a single Nvidia Titan XP. |
| Software Dependencies | No | The paper mentions "Adafactor (Shazeer and Stern 2018)" as the optimizer but does not specify versions for other key software components or libraries (e.g., Python, TensorFlow, PyTorch). |
| Experiment Setup | Yes | For neural networks, we set the number of NL reader layers Nd = 6, and N1 = N2 = 5 for the AST reader as well as the decoder. All embeddings have size 256. The hidden sizes were all set to 256, except the first layer of each fully-connected sub-network, which had 1024 dimensions. We applied dropout after each layer (including attention layers, gating mechanism layers, convolutional layers, and fully-connected layers), with a drop rate of 0.15. The model is optimized by Adafactor (Shazeer and Stern 2018) with default parameters. |
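
For reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration object. The sketch below is a minimal, hypothetical illustration, not the authors' code: field names such as `nl_reader_layers` and `ffn_first_layer_size` are our own; only the numeric values come from the paper.

```python
from dataclasses import dataclass


@dataclass
class TreeGenConfig:
    """Hyperparameters reported for TreeGen (field names are hypothetical)."""
    nl_reader_layers: int = 6         # Nd = 6 NL reader layers
    ast_reader_layers: int = 5        # N1 = 5 AST reader layers
    decoder_layers: int = 5           # N2 = 5 decoder layers
    embedding_size: int = 256         # all embeddings are 256-dimensional
    hidden_size: int = 256            # hidden size used throughout
    ffn_first_layer_size: int = 1024  # first fully-connected layer is wider
    dropout_rate: float = 0.15        # dropout applied after every layer
    optimizer: str = "Adafactor"      # used with default parameters


if __name__ == "__main__":
    print(TreeGenConfig())
```

Keeping these values in one place makes it easy to check a reimplementation against the configuration reported in the paper.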