Neural Program Generation Modulo Static Analysis
Authors: Rohan Mukherjee, Yeming Wen, Dipak Chaudhari, Thomas Reps, Swarat Chaudhuri, Christopher Jermaine
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach in the task of generating the entire body of a Java method given the rest of the class in which the method occurs. Our experiments show that the approach substantially outperforms state-of-the-art transformers and a model that explicitly tries to learn program semantics on this task, both in terms of producing programs free of basic semantic errors and in terms of syntactically matching the ground truth. |
| Researcher Affiliation | Academia | Rohan Mukherjee (Rice University); Dipak Chaudhari; Thomas W. Reps (University of Wisconsin); Swarat Chaudhuri; Chris Jermaine (Rice University) |
| Pseudocode | Yes | Algorithm 1: Gen(S, A(S)♯, SymSoFar, Z) |
| Open Source Code | Yes | Our implementation is available at https://github.com/rohanmukh/nsg. |
| Open Datasets | Yes | Data. To test our hypothesis, we used a curated, deduplicated set of Java source-code files [26]. The training data is composed of 1.5M Java methods and associated contexts, extracted from public GitHub repositories. |
| Dataset Splits | Yes | The validation set is composed of 10k Java methods and associated contexts, and the test set is composed of 5k Java methods and associated contexts. |
| Hardware Specification | Yes | We train the model on a cluster with NVIDIA A100 GPUs. |
| Software Dependencies | No | We trained our framework on top of TensorFlow [1]. Our model is implemented in Python and makes use of the TensorFlow and PyTorch libraries. The paper mentions software libraries but does not specify their version numbers. |
| Experiment Setup | Yes | Hyperparameters are set using a random search. The hyperparameters for the NSG model are given in Table 4. |
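The paper states that hyperparameters were set using a random search, with the final values reported in its Table 4. As an illustration of that tuning procedure, the sketch below shows a minimal random search over a hyperparameter grid; the search space, trial count, and function names here are hypothetical assumptions for exposition, not the authors' actual configuration.

```python
import random

# Hypothetical search space -- illustrative only, not the values from the
# paper's Table 4.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "hidden_units": [256, 512, 1024],
    "dropout": [0.0, 0.2, 0.5],
}

def sample_config(rng):
    """Draw one hyperparameter configuration uniformly at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def random_search(evaluate, num_trials=10, seed=0):
    """Return the best (score, config) pair over `num_trials` random draws.

    `evaluate` maps a configuration dict to a validation score
    (higher is better); in practice it would train the model and
    measure performance on the validation set.
    """
    rng = random.Random(seed)
    best_score, best_config = float("-inf"), None
    for _ in range(num_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score > best_score:
            best_score, best_config = score, config
    return best_score, best_config
```

A toy run might score configurations with a stand-in objective, e.g. `random_search(lambda c: -c["learning_rate"], num_trials=5)`, which simply prefers the smallest sampled learning rate.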