Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
TreeGen: A Tree-Based Transformer Architecture for Code Generation
Authors: Zeyu Sun, Qihao Zhu, Yingfei Xiong, Yican Sun, Lili Mou, Lu Zhang | Pages 8984-8991
AAAI 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated TreeGen on a Python benchmark, HearthStone, and two semantic parsing benchmarks, ATIS and GEO. TreeGen outperformed the previous state-of-the-art approach by 4.5 percentage points on HearthStone, and achieved the best accuracy among neural network-based approaches on ATIS (89.1%) and GEO (89.6%). We also conducted an ablation test to better understand each component of our model. |
| Researcher Affiliation | Academia | Zeyu Sun, Qihao Zhu, Yingfei Xiong, Yican Sun, Lili Mou, Lu Zhang Key Laboratory of High Confidence Software Technologies (Peking University), MoE; Software Institute, Peking University, 100871, P. R. China EMAIL University of Alberta, Edmonton, AB, Canada EMAIL |
| Pseudocode | No | The paper describes the architecture and various components in detail but does not include any explicit pseudocode blocks or algorithms. |
| Open Source Code | Yes | The code is available at https://github.com/zysszy/TreeGen |
| Open Datasets | Yes | We followed the train-dev-test split in Ling et al. (2016), and the statistic is listed in Table 2. |
| Dataset Splits | Yes | We followed the train-dev-test split in Ling et al. (2016), and the statistic is listed in Table 2. |
| Hardware Specification | Yes | It takes 18s for an epoch on a single Nvidia Titan XP |
| Software Dependencies | No | The paper mentions "Adafactor (Shazeer and Stern 2018)" as the optimizer but does not specify versions for other key software components or libraries (e.g., Python, TensorFlow, PyTorch). |
| Experiment Setup | Yes | For neural networks, we set the number of NL reader layers Nd = 6, and N1 = N2 = 5 for the AST reader as well as the decoder. The size of all embeddings is 256. The hidden sizes were all set to 256, except in the fully-connected layers, where the first layer has 1024 dimensions. We applied dropout after each layer (including attention layers, gating mechanism layers, convolutional layers, and fully-connected layers), with a drop rate of 0.15. The model is optimized by Adafactor (Shazeer and Stern 2018) with default parameters. |
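
The hyperparameters quoted in the Experiment Setup row can be gathered into one configuration object, which may help when comparing a re-implementation against the reported settings. The sketch below is only an illustration: the class name `TreeGenSetup`, its field names, and the helper `describe` are hypothetical and do not come from the paper or its repository, and the mapping of N1 to the AST reader and N2 to the decoder is an assumption based on the order in which the row mentions them.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TreeGenSetup:
    """Values quoted in the Experiment Setup row; all field names are hypothetical."""

    nl_reader_layers: int = 6         # Nd in the quoted text
    ast_reader_layers: int = 5        # N1 (assumed to refer to the AST reader)
    decoder_layers: int = 5           # N2 (assumed to refer to the decoder)
    embedding_size: int = 256
    hidden_size: int = 256
    ffn_first_layer_size: int = 1024  # first layer of each fully-connected block
    dropout_rate: float = 0.15
    optimizer: str = "Adafactor"      # reported as used with default parameters


def describe(setup: TreeGenSetup) -> str:
    """Render the setup as one 'name = value' line per field."""
    return "\n".join(f"{name} = {value}" for name, value in asdict(setup).items())


if __name__ == "__main__":
    print(describe(TreeGenSetup()))
```

Freezing the dataclass keeps the recorded values from being changed by accident during an experiment; a variant setup would be created as a new instance instead.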