reproducibilityindex.ai

Lyra: A Benchmark for Turducken-Style Code Generation

Authors: Qingyuan Liang, Zeyu Sun, Qihao Zhu, Wenjie Zhang, Lian Yu, Yingfei Xiong, Lu Zhang

IJCAI 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In our experiment, we adopted Transformer, BERT-style, and GPT-style models as baselines. In the best setting, the generation performance of GPT-style models is better than others, where the AST exact matching accuracy is 24% and 25.5% when using Chinese and English comments, respectively.
Researcher Affiliation	Academia	(1) Key Laboratory of High Confidence Software Technologies, Ministry of Education (Peking University); School of Computer Science, Peking University, Beijing, PR China. (2) School of Software & Microelectronics, Peking University, Beijing, PR China.
Pseudocode	No	No pseudocode or algorithm blocks were found in the paper.
Open Source Code	Yes	The Lyra dataset and code is available at https://github.com/LIANGQINGYUAN/Lyra.
Open Datasets	Yes	The Lyra dataset and code is available at https://github.com/LIANGQINGYUAN/Lyra.
Dataset Splits	Yes	We randomly selected the 10% of 2,000 examples in our dataset for testing and validation respectively, and the remaining 80% for training.
Hardware Specification	No	No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned in the paper.
Software Dependencies	No	The paper mentions Python and tools like Pylint, but does not provide specific version numbers for these or other software dependencies required for reproducibility.
Experiment Setup	No	The paper mentions the models used and dataset splits but does not provide specific hyperparameter values or detailed training configurations for the experimental setup.
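
The Dataset Splits row describes a random 80/10/10 partition of 2,000 examples. A minimal sketch of such a split is shown here for illustration only. The function name `split_dataset`, the fixed seed, and the order in which the test and validation portions are drawn are all assumptions; the paper does not report a seed or the exact procedure, so this will not reproduce the authors' actual split.

```python
import random


def split_dataset(examples, seed=0, test_frac=0.1, valid_frac=0.1):
    """Shuffle examples and partition them into train/validation/test lists.

    Hypothetical helper: the seed and the fractions' ordering are assumptions,
    not details taken from the paper.
    """
    rng = random.Random(seed)          # local RNG so the global state is untouched
    shuffled = list(examples)
    rng.shuffle(shuffled)

    n_test = round(len(shuffled) * test_frac)
    n_valid = round(len(shuffled) * valid_frac)

    test = shuffled[:n_test]
    valid = shuffled[n_test:n_test + n_valid]
    train = shuffled[n_test + n_valid:]  # everything left over is training data
    return train, valid, test


# Stand-in for the 2,000 Lyra examples: integer IDs instead of real records.
train, valid, test = split_dataset(range(2000))
print(len(train), len(valid), len(test))  # 1600 200 200
```

Fixing the seed makes the partition repeatable across runs, which is the property a reproduction would need the original work to document.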