Neural Program Generation Modulo Static Analysis
Authors: Rohan Mukherjee, Yeming Wen, Dipak Chaudhari, Thomas Reps, Swarat Chaudhuri, Christopher Jermaine
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach in the task of generating the entire body of a Java method given the rest of the class in which the method occurs. Our experiments show that the approach substantially outperforms state-of-the-art transformers and a model that explicitly tries to learn program semantics on this task, both in terms of producing programs free of basic semantic errors and in terms of syntactically matching the ground truth. |
| Researcher Affiliation | Academia | Rohan Mukherjee (Rice University); Dipak Chaudhari; Thomas W. Reps (University of Wisconsin); Swarat Chaudhuri; Chris Jermaine (Rice University) |
| Pseudocode | Yes | Algorithm 1: Gen(S, A(S)♯, SymSoFar, Z) |
| Open Source Code | Yes | Our implementation is available at https://github.com/rohanmukh/nsg. |
| Open Datasets | Yes | Data. To test our hypothesis, we used a curated, deduplicated set of Java source-code files [26]. The training data is composed of 1.5M Java methods and associated contexts, extracted from public GitHub repositories. |
| Dataset Splits | Yes | The validation set is composed of 10k Java methods and associated contexts, and the test set is composed of 5k Java methods and associated contexts. |
| Hardware Specification | Yes | We train the model on a cluster with NVIDIA A100 GPUs. |
| Software Dependencies | No | We trained our framework on top of TensorFlow [1]. Our model is implemented in Python and makes use of the TensorFlow and PyTorch libraries. The paper mentions software libraries but does not specify their version numbers. |
| Experiment Setup | Yes | Hyperparameters are set using a random search. The hyperparameters for the NSG model are given in Table 4. |
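The paper states that hyperparameters were set using a random search, with the final values reported in its Table 4. As an illustration of that tuning procedure, the sketch below shows a minimal random search over a hyperparameter grid; the search space, trial count, and function names here are hypothetical assumptions for exposition, not the authors' actual configuration.

```python
import random

# Hypothetical search space -- illustrative only, not the values from the
# paper's Table 4.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "hidden_units": [256, 512, 1024],
    "dropout": [0.0, 0.2, 0.5],
}

def sample_config(rng):
    """Draw one hyperparameter configuration uniformly at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def random_search(evaluate, num_trials=10, seed=0):
    """Return the best (score, config) pair over `num_trials` random draws.

    `evaluate` maps a configuration dict to a validation score
    (higher is better); in practice it would train the model and
    measure performance on the validation set.
    """
    rng = random.Random(seed)
    best_score, best_config = float("-inf"), None
    for _ in range(num_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score > best_score:
            best_score, best_config = score, config
    return best_score, best_config
```

A toy run might score configurations with a stand-in objective, e.g. `random_search(lambda c: -c["learning_rate"], num_trials=5)`, which simply prefers the smallest sampled learning rate.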