Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Break-It-Fix-It: Unsupervised Learning for Program Repair
Authors: Michihiro Yasunaga, Percy Liang
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate BIFI on two code repair datasets: GitHub-Python, a new dataset we introduce where the goal is to repair Python code with AST parse errors; and DeepFix, where the goal is to repair C code with compiler errors. BIFI outperforms state-of-the-art methods, obtaining 90.5% repair accuracy on GitHub-Python (+28.5%) and 71.7% on DeepFix (+5.6%). |
| Researcher Affiliation | Academia | Michihiro Yasunaga and Percy Liang, Stanford University, Stanford, CA. |
| Pseudocode | No | The paper describes the algorithm steps in paragraph text and equations (Eq 3-9) but does not include formal pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | Code and data are available at https://github.com/michiyasunaga/bifi. |
| Open Datasets | Yes | Code and data are available at https://github.com/michiyasunaga/bifi (GitHub-Python) and https://bitbucket.org/iiscseal/deepfix (DeepFix). |
| Dataset Splits | Yes | We hold out 1% of P_synthetic as our dev set, which we use to perform early stopping. |
| Hardware Specification | Yes | on one GPU (GTX Titan X). |
| Software Dependencies | No | The paper mentions using specific algorithms like Transformer and Adam, but does not provide specific version numbers for software dependencies such as Python, PyTorch, or TensorFlow libraries. |
| Experiment Setup | Yes | For the architecture of the fixer and breaker, we use the encoder-decoder Transformer (Vaswani et al., 2017) with 4 layers, 8 attention heads, and hidden states of size 256. The model parameters are optimized by Adam (Kingma & Ba, 2015), with a batch size of 20,000 tokens, learning rate 0.0001, and gradient clipping 1.0 (Pascanu et al., 2013). For generation, we use beam search with beam size 10, and keep predictions with Levenshtein edit distance less than 5 tokens from the input. |
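The generation filter quoted in the Experiment Setup row (keep beam-search candidates within a token-level Levenshtein edit distance of 5 from the input) can be sketched as follows. This is a minimal illustration, not the paper's code; the function names and the use of whitespace tokenization are assumptions.

```python
def token_edit_distance(a, b):
    """Levenshtein distance between two token sequences
    (insertion, deletion, substitution; each costs 1)."""
    prev = list(range(len(b) + 1))  # distances for the empty prefix of a
    for i, ta in enumerate(a, start=1):
        curr = [i]
        for j, tb in enumerate(b, start=1):
            cost = 0 if ta == tb else 1
            curr.append(min(prev[j] + 1,          # delete ta
                            curr[j - 1] + 1,      # insert tb
                            prev[j - 1] + cost))  # match / substitute
        prev = curr
    return prev[-1]

def filter_beam(src_tokens, candidates, max_edits=5):
    """Keep only beam outputs whose token edit distance to the input
    is below max_edits, discarding rewrites that drift too far."""
    return [c for c in candidates
            if token_edit_distance(src_tokens, c) < max_edits]

# Hypothetical example: repairing a Python snippet with a parse error.
src = "def f ( x ) return x".split()
candidates = [
    "def f ( x ) : return x".split(),   # small fix: distance 1, kept
    "class g : pass pass pass".split(), # unrelated rewrite, dropped
]
kept = filter_beam(src, candidates)
```

Restricting outputs to a small edit distance matches the task: repairing a parse or compiler error usually requires changing only a few tokens, so candidates far from the input are more likely to be unfaithful rewrites than repairs.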