Lyra: A Benchmark for Turducken-Style Code Generation
Authors: Qingyuan Liang, Zeyu Sun, Qihao Zhu, Wenjie Zhang, Lian Yu, Yingfei Xiong, Lu Zhang
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiment, we adopted Transformer, BERT-style, and GPT-style models as baselines. In the best setting, GPT-style models outperform the others, with an AST exact matching accuracy of 24% and 25.5% when using Chinese and English comments, respectively (see the AST exact-match sketch after this table). |
| Researcher Affiliation | Academia | 1Key Laboratory of High Confidence Software Technologies, Ministry of Education (Peking University). School of Computer Science, Peking University. Beijing, PR China 2School of Software & Microelectronics, Peking University. Beijing, PR China |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | The Lyra dataset and code are available at https://github.com/LIANGQINGYUAN/Lyra. |
| Open Datasets | Yes | The Lyra dataset and code are available at https://github.com/LIANGQINGYUAN/Lyra. |
| Dataset Splits | Yes | We randomly selected 10% of the 2,000 examples in our dataset for testing and another 10% for validation, with the remaining 80% used for training (see the split sketch after this table). |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned in the paper. |
| Software Dependencies | No | The paper mentions Python and tools like Pylint, but does not provide specific version numbers for these or other software dependencies required for reproducibility. |
| Experiment Setup | No | The paper mentions the models used and dataset splits but does not provide specific hyperparameter values or detailed training configurations for the experimental setup. |
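
The metric reported in the Research Type row is AST exact matching accuracy. The sketch below shows one plausible way to compute it for Lyra's Python programs: parse the reference and generated code with Python's `ast` module and count a hit when the dumped trees are identical. This is an assumption about the metric's implementation; the normalization used by the Lyra authors may differ, and embedded SQL inside string literals is compared verbatim here.

```python
import ast


def ast_exact_match(reference: str, generated: str) -> bool:
    """Return True if the two programs parse to identical Python ASTs.

    Ignores formatting and comments, but embedded SQL inside string
    literals must match verbatim (assumed behavior, not from the paper).
    """
    try:
        ref_tree = ast.parse(reference)
        gen_tree = ast.parse(generated)
    except SyntaxError:
        # Unparsable generations count as mismatches.
        return False
    return ast.dump(ref_tree) == ast.dump(gen_tree)


def ast_exact_match_accuracy(pairs) -> float:
    """Accuracy over an iterable of (reference, generated) code pairs."""
    pairs = list(pairs)
    hits = sum(ast_exact_match(ref, gen) for ref, gen in pairs)
    return hits / len(pairs)
```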
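
The Dataset Splits row describes an 80%/10%/10% random split of the 2,000 examples. A minimal sketch of such a split is given below; the seed, file name, and JSON layout are illustrative assumptions, not details taken from the paper or the Lyra repository.

```python
import json
import random

# Hypothetical input: a JSON list of the 2,000 Lyra examples, each pairing
# a natural-language comment with its turducken-style (Python + SQL) program.
with open("lyra_examples.json") as f:  # assumed file name
    examples = json.load(f)

random.seed(0)  # assumed seed; the paper does not report one
random.shuffle(examples)

n = len(examples)        # 2,000 in Lyra
n_test = n // 10         # 10% for testing
n_valid = n // 10        # 10% for validation

test_set = examples[:n_test]
valid_set = examples[n_test:n_test + n_valid]
train_set = examples[n_test + n_valid:]  # remaining 80% for training

print(len(train_set), len(valid_set), len(test_set))  # 1600 200 200
```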