On-the-Fly Adapting Code Summarization on Trainable Cost-Effective Language Models

Authors: Yufan Cai, Yun Lin, Chenyan Liu, Jinglian Wu, Yifan Zhang, Yiming Liu, Yeyun Gong, Jin Song Dong

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments on 7 comment generators and 4 public datasets show that (1) AdaCom can significantly boost the performance of comment generation (BLEU-4 score by on average 14.9%, METEOR by 12.2%, and ROUGE-L by 7.4%), (2) the adaptation on one code sample is cost-effective and acceptable as an on-the-fly solution, and (3) AdaCom can adapt well on out-of-distribution code samples.
Researcher Affiliation | Collaboration | Yufan Cai (Shanghai Jiao Tong University / National University of Singapore, cai_yufan@u.nus.edu); Chenyan Liu (National University of Singapore, chenyan@u.nus.edu); Jinglian Wu (National University of Singapore, jinglian_wu@u.nus.edu); Yifan Zhang (National University of Singapore, yfzhang@nus.edu.sg); Yiming Liu (National University of Singapore, e0945794@u.nus.edu); Yeyun Gong (Microsoft, yegong@microsoft.com); Jin Song Dong (National University of Singapore, dcsdjs@nus.edu.sg)
Pseudocode | No | The paper describes the steps of AdaCom in text and with a diagram (Figure 1), but it does not include a formal pseudocode block or algorithm listing.
Open Source Code | Yes | All the experiment details and replication details can be referred to our website [6].
Open Datasets | Yes | In this experiment, we adopt four public datasets including CodeSearchNet [19] (CSN), CodeKG [9], FunCom [21], and CosBench [41].
Dataset Splits | Yes | Table 3: Experiment Datasets:
    Dataset         Train       Valid     Test
    FunCom          1,954,807   104,273   90,908
    CosBench          296,425    42,348   84,694
    CodeKG            161,857    20,282   40,512
    CSN-Python        251,820    13,914   14,918
    CSN-PHP           241,241    12,982   14,014
    CSN-Go            167,288     7,325    8,122
    CSN-Java          164,923     5,183   10,955
    CSN-JavaScript     58,025     3,885    3,291
    CSN-Ruby           24,927     1,400    1,261
Hardware Specification | Yes | Our experiments are conducted on 2 Ubuntu 20.04 servers equipped with 2 AMD Ryzen™ 9 5950X 16-core CPUs, 128 GB memory, and 2 Nvidia RTX™ A4000 GPU cards.
Software Dependencies | No | The paper mentions "Ubuntu 20.04" as the operating system, but does not provide version numbers for other key software components, such as deep learning frameworks (e.g., PyTorch, TensorFlow) or the specific library versions used for implementation.
Experiment Setup | No | The paper mentions using a "dropout training strategy", an "early stop mechanism", and freezing
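The Research Type row above quotes gains in BLEU-4, METEOR, and ROUGE-L. For reference, minimal self-contained versions of ROUGE-L (an LCS-based F-measure) and smoothed sentence-level BLEU-4 can be sketched as follows; the add-one smoothing and the beta weight are illustrative assumptions, not the paper's actual evaluation scripts:

```python
from collections import Counter
import math

def lcs_len(a, b):
    # Longest common subsequence length via dynamic programming.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(candidate, reference, beta=1.2):
    # ROUGE-L: F-measure over the LCS of candidate and reference tokens.
    # beta=1.2 is a common choice in code-summarization work, assumed here.
    lcs = lcs_len(candidate, reference)
    if lcs == 0:
        return 0.0
    p = lcs / len(candidate)
    r = lcs / len(reference)
    return (1 + beta ** 2) * p * r / (r + beta ** 2 * p)

def bleu4(candidate, reference):
    # Sentence-level BLEU-4 with add-one smoothing on the n-gram precisions.
    precisions = []
    for n in range(1, 5):
        cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
        overlap = sum((cand & ref).values())
        total = max(sum(cand.values()), 1)
        precisions.append((overlap + 1) / (total + 1))  # smoothed precision
    # Brevity penalty: 1 when the candidate is at least as long as the reference.
    bp = min(1.0, math.exp(1 - len(reference) / max(len(candidate), 1)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / 4)
```

Both functions take pre-tokenized comments, e.g. `rouge_l_f1("returns the sum".split(), "computes the sum".split())`.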
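The Experiment Setup row mentions an "early stop mechanism" without giving hyperparameters. A generic, framework-agnostic sketch of such a mechanism might look like the following; the class name, `patience`, and `min_delta` are assumptions, not values from the paper:

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        # Returns True when training should stop.
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a PyTorch-style training loop, the parameter freezing mentioned alongside it would typically be done by setting `requires_grad = False` on the frozen layers' parameters; that is a common practice assumed here, not a claim about AdaCom's exact implementation.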