Learning Structural Edits via Incremental Tree Transformations

Authors: Ziyu Yao, Frank F. Xu, Pengcheng Yin, Huan Sun, Graham Neubig

ICLR 2021

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate our proposed editor on two source code edit datasets, where results show that, with the proposed edit encoder, our editor significantly improves accuracy over previous approaches that generate the edited program directly in one pass.
Researcher Affiliation Academia Ziyu Yao (The Ohio State University, yao.470@osu.edu); Frank F. Xu and Pengcheng Yin (Carnegie Mellon University, {fangzhex,pcyin}@cs.cmu.edu); Huan Sun (The Ohio State University, sun.397@osu.edu); Graham Neubig (Carnegie Mellon University, gneubig@cs.cmu.edu)
Pseudocode Yes Algorithm 1 DAggerSampling...Algorithm 2 PostRefineSampling...Algorithm 3 TreeShortestDist
Open Source Code Yes Our source code is available at https://github.com/neulab/incremental_tree_edit.
Open Datasets Yes We test our methods on two source code edit datasets introduced by Yin et al. (2019), also largely following their experimental setting.
Dataset Splits Yes The GitHub Edits (GHE) dataset contains (C-, C+) pairs and their surrounding context collected from the commit logs of 54 GitHub C# projects. The dataset is split into train/dev/test sets of 91,372 / 10,176 / 10,176 samples.
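The reported split sizes can be sanity-checked with a short calculation; the dictionary below is a minimal sketch using only the numbers stated above, not the dataset's actual loading code:

```python
# Split sizes for the GitHub Edits (GHE) dataset as reported in the paper.
splits = {"train": 91_372, "dev": 10_176, "test": 10_176}

# Total sample count and the fraction each split represents.
total = sum(splits.values())
fractions = {name: round(size / total, 3) for name, size in splits.items()}

print(total)      # 111724
print(fractions)  # roughly 0.818 / 0.091 / 0.091
```

This shows the split is roughly 82% train with equal-sized dev and test sets.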
Hardware Specification No The paper describes the computational models and experimental setups in terms of software and data but does not provide specific details about the hardware used for running the experiments (e.g., CPU, GPU models, or cloud computing instances).
Software Dependencies No The paper mentions various software components and frameworks used (e.g., LSTM, GGNN, ASDL), but it does not specify exact version numbers for these or other software dependencies.
Experiment Setup Yes For the encoder of our neural editor, the dimension of word embedding and the tree node representation is set to 128. The dimension of the bidirectional LSTM encoder for encoding input code tokens and contexts is set to 64. The hidden state for tracking tree history is set to 256 dimensions. On the decoder side, the dimensions of the operator embedding, the field embedding, the production rule embedding, and the hidden vector in value prediction are set to 32, 32, 128 and 256, respectively. ... we train our Graph2Edit for 30 epochs on the GitHub Edits training set...
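The reported hyperparameters can be collected into one place for reference. The sketch below groups them into plain dictionaries; the key names are illustrative and need not match the authors' released code:

```python
# Hyperparameters exactly as reported in the paper's experiment setup.
# Key names are hypothetical; only the numeric values come from the paper.
ENCODER_CONFIG = {
    "word_embed_dim": 128,          # word embedding dimension
    "node_repr_dim": 128,           # tree node representation dimension
    "bilstm_hidden_dim": 64,        # BiLSTM for input code tokens and contexts
    "tree_history_hidden_dim": 256, # hidden state tracking tree edit history
}

DECODER_CONFIG = {
    "operator_embed_dim": 32,        # edit operator embedding
    "field_embed_dim": 32,           # AST field embedding
    "production_rule_embed_dim": 128, # grammar production rule embedding
    "value_pred_hidden_dim": 256,    # hidden vector for value prediction
}

TRAIN_CONFIG = {
    "epochs": 30,                   # training epochs on GitHub Edits
    "dataset": "GitHub Edits",
}
```

Having the dimensions in a single config like this makes it easier to compare against the released code at https://github.com/neulab/incremental_tree_edit when reproducing the results.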