EffiLearner: Enhancing Efficiency of Generated Code via Self-Optimization
Authors: Dong Huang, Jianbo Dai, Han Weng, Puzhen Wu, Yuhao Qing, Heming Cui, Zhijiang Guo, Jie M. Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the effectiveness of EffiLearner, we conduct extensive experiments on EffiBench, HumanEval, and MBPP with 16 open-source and 6 closed-source models. Our evaluation results demonstrate that through iterative self-optimization, EffiLearner significantly enhances the efficiency of LLM-generated code. |
| Researcher Affiliation | Academia | Dong Huang, The University of Hong Kong, dhuang@cs.hku.hk; Jianbo Dai, University of Edinburgh, j6dj6d@gmail.com; Han Weng, Beijing University of Posts and Telecommunications, han.weng@bupt.edu.cn; Puzhen Wu, University College Dublin, puzhen.wu@ucdconnect.ie; Yuhao Qing, The University of Hong Kong, yhqing@cs.hku.hk; Heming Cui, The University of Hong Kong / Shanghai AI Laboratory, heming@cs.hku.hk; Zhijiang Guo, University of Cambridge, zg283@cam.ac.uk; Jie M. Zhang, King's College London, jie.zhang@kcl.ac.uk |
| Pseudocode | No | The paper describes the framework components and provides code examples in Python, but does not contain a dedicated pseudocode or algorithm block. |
| Open Source Code | Yes | The source code of EffiLearner was released at https://github.com/huangd1999/EffiLearner. |
| Open Datasets | Yes | To evaluate the effectiveness of EffiLearner, we conduct extensive experiments on EffiBench, HumanEval, and MBPP with 16 open-source and 6 closed-source models. We evaluate EffiLearner on EffiBench [27]. For the HumanEval and MBPP datasets, we set the test cases provided by HumanEval and MBPP as open test cases, while test cases provided by EvalPlus [35] (i.e., HumanEval-Plus, MBPP-Plus) serve as private test cases used to calculate the final results. |
| Dataset Splits | Yes | Following Huang et al. [27], we utilize the open test cases to calculate the efficiency metrics during the self-optimization process, while private test cases provided by EffiBench were used for the final result evaluation. For the HumanEval and MBPP datasets, we set the test cases provided by HumanEval and MBPP as open test cases, while test cases provided by EvalPlus [35] (i.e., HumanEval-Plus, MBPP-Plus) serve as private test cases used to calculate the final results. |
| Hardware Specification | Yes | All of the experiments are conducted on an edge server with an Intel Xeon Platinum 8336C CPU with 128 cores and 8 × NVIDIA A100-SXM GPUs, with a total memory capacity of 2.0 TiB. |
| Software Dependencies | No | The paper mentions using Python, the `line_profiler` library, and the `memory_profiler` library, but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | We carefully design prompts to guide LLMs in optimizing code efficiency while ensuring the optimized code passes predefined test cases. The prompt template (Figure 3) used in EffiLearner's self-optimization stage includes a task description, test case, initial code, overhead analysis, and optimization rules. To investigate the impact of the number of self-optimization steps on the efficiency of the EffiLearner-optimized code, we conduct an ablation study by varying the number of steps from 0 to 5. |
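The self-optimization stage described above (profile the code, feed the overhead report back into the prompt, keep a candidate only if it still passes the open test cases) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: `profile_overhead` uses the stdlib `time` and `tracemalloc` modules as stand-ins for the paper's `line_profiler`/`memory_profiler` reports, and the `self_optimize` function, prompt wording, and toy `llm` callable are all assumptions for demonstration.

```python
import time
import tracemalloc


def profile_overhead(code: str, test_case: str) -> str:
    """Measure execution time and peak memory of code + test case.

    Stdlib stand-in for the line_profiler / memory_profiler
    overhead reports that EffiLearner feeds back to the LLM.
    """
    tracemalloc.start()
    start = time.perf_counter()
    exec(code + "\n" + test_case, {})
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return f"execution time: {elapsed:.6f}s, peak memory: {peak} bytes"


def self_optimize(task: str, code: str, test_case: str, llm, steps: int = 5) -> str:
    """Iterative self-optimization loop (hypothetical sketch).

    `llm` is any prompt -> code callable. The prompt fields mirror the
    template described in the paper (task description, test case, current
    code, overhead analysis, optimization rules); the exact wording here
    is invented. Candidates that fail the open test case are discarded.
    """
    for _ in range(steps):
        report = profile_overhead(code, test_case)
        prompt = (
            f"Task: {task}\nTest case: {test_case}\n"
            f"Current code:\n{code}\nOverhead analysis: {report}\n"
            "Rules: optimize efficiency; the code must still pass the test case."
        )
        candidate = llm(prompt)
        try:
            exec(candidate + "\n" + test_case, {})  # reject failing candidates
        except Exception:
            continue
        code = candidate
    return code


# Toy usage: a fixed "LLM" that always returns a closed-form solution.
slow = "def total(n):\n    return sum(range(n + 1))"
fast = "def total(n):\n    return n * (n + 1) // 2"
test = "assert total(100) == 5050"
result = self_optimize("sum 0..n", slow, test, llm=lambda p: fast, steps=1)
```

Because the efficient candidate passes the open test case, the loop adopts it; a candidate raising an exception or failing the assertion would leave the previous code in place, which matches the paper's requirement that optimized code must still pass the predefined tests.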