Code Generation as a Dual Task of Code Summarization

Authors: Bolin Wei, Ge Li, Xin Xia, Zhiyi Fu, Zhi Jin

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach on two datasets collected from GitHub, and experimental results show that our dual framework can improve the performance of CS and CG tasks over baselines.
Researcher Affiliation | Academia | Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, China; Software Institute, Peking University, China; Faculty of Information Technology, Monash University, Australia
Pseudocode | Yes | The paper provides Algorithm 1 together with an algorithm description.
Open Source Code | No | The paper states 'Our implementation is based on PyTorch' with a footnote linking to https://pytorch.org/, which is the official PyTorch website, not the authors' implementation code for this paper. No explicit statement about releasing their own source code is found.
Open Datasets | Yes | We conduct our CS and CG experiments on two datasets, including a Java dataset [Hu et al., 2018b] and a Python dataset [Wan et al., 2018]. The original Python dataset is collected by Barone and Sennrich [2017].
Dataset Splits | Yes | Each dataset is split into training, test and validation sets by 8:1:1. (See the split sketch after the table.)
Hardware Specification | No | The paper mentions 'to fit GPU memory' but does not provide specific details such as GPU models (e.g., NVIDIA A100, RTX 2080 Ti), CPU models, or other hardware used for the experiments.
Software Dependencies | No | The paper states 'Our implementation is based on PyTorch' but does not provide version numbers for PyTorch or any other software dependency.
Experiment Setup | Yes | We set the token embeddings and LSTM states both to 512 dimensions for the CS model and set the LSTM states to 256 dimensions for the CG model... The dropout rates of all models are set to 0.2 and mini-batch sizes of all models to 32. For dual learning process, we observe that the SGD is appropriate with initial learning rate 0.2... The λdual1 and λdual2 are set to 0.001 and 0.01 respectively, and the λatt1 and λatt2 are set to 0.01 and 0.1. We use beam search in the inference process, whose size is set to 10... Adam is chosen as our optimizer, and the initial learning rate is set to 0.002. (See the configuration sketch after the table.)
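
The 8:1:1 split noted in the Dataset Splits row is straightforward to reproduce. The sketch below is a minimal illustration, not the authors' procedure: the paper reports only the ratio, so the shuffling step, the random seed, and the split_dataset helper itself are assumptions.

    import random

    def split_dataset(examples, seed=0):
        # Shuffle with a fixed seed so the split is reproducible.
        # The seed and the shuffling step are assumptions; the paper
        # only states the 8:1:1 ratio.
        examples = list(examples)
        random.Random(seed).shuffle(examples)
        n = len(examples)
        n_train = int(0.8 * n)
        n_valid = int(0.1 * n)
        train = examples[:n_train]
        valid = examples[n_train:n_train + n_valid]
        test = examples[n_train + n_valid:]
        return train, valid, test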
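The hyperparameters quoted in the Experiment Setup row can be gathered into one configuration. The PyTorch sketch below only restates those reported numeric values; the LSTM modules are hypothetical stand-ins for the authors' unreleased CS/CG models, and the CG input dimension is an assumption (the paper does not state the CG embedding size).

    import torch
    import torch.nn as nn

    # Numeric values come from the quoted experiment setup; module choices
    # and wiring are hypothetical placeholders for the unreleased models.
    CONFIG = {
        "cs_embed_dim": 512,            # CS model: token embeddings
        "cs_hidden_dim": 512,           # CS model: LSTM states
        "cg_hidden_dim": 256,           # CG model: LSTM states
        "dropout": 0.2,
        "batch_size": 32,
        "adam_lr": 0.002,               # Adam for the individual models
        "sgd_lr": 0.2,                  # SGD for the dual learning process
        "lambda_dual": (0.001, 0.01),   # λdual1, λdual2
        "lambda_att": (0.01, 0.1),      # λatt1, λatt2
        "beam_size": 10,
    }

    # Placeholder encoders standing in for the CS and CG seq2seq models;
    # the input size of the CG encoder is assumed, not reported.
    cs_encoder = nn.LSTM(CONFIG["cs_embed_dim"], CONFIG["cs_hidden_dim"],
                         num_layers=2, dropout=CONFIG["dropout"],
                         batch_first=True)
    cg_encoder = nn.LSTM(CONFIG["cs_embed_dim"], CONFIG["cg_hidden_dim"],
                         num_layers=2, dropout=CONFIG["dropout"],
                         batch_first=True)

    # Optimizers built with the reported learning rates.
    adam = torch.optim.Adam(
        list(cs_encoder.parameters()) + list(cg_encoder.parameters()),
        lr=CONFIG["adam_lr"])
    dual_sgd = torch.optim.SGD(
        list(cs_encoder.parameters()) + list(cg_encoder.parameters()),
        lr=CONFIG["sgd_lr"])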