Improving Neural Program Synthesis with Inferred Execution Traces
Authors: Eui Chul Shin, Illia Polosukhin, Dawn Song
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical results show that this modification leads to state-of-the-art results on the Karel [Pattis, 1981] program synthesis task, improving upon Bunel et al. [2018] from 77.12% to 81.3% accuracy. |
| Researcher Affiliation | Collaboration | Richard Shin, UC Berkeley (ricshin@cs.berkeley.edu); Illia Polosukhin, NEAR Protocol (illia@nearprotocol.com); Dawn Song, UC Berkeley (dawnsong@cs.berkeley.edu) |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found. |
| Open Source Code | No | The paper links to a dataset: "To train and test our models, we used the same dataset as Bunel et al. [2018], from https://bit.ly/karel-dataset." However, there is no explicit statement or link providing the open-source code for the methodology described in the paper. |
| Open Datasets | Yes | To train and test our models, we used the same dataset as Bunel et al. [2018], from https://bit.ly/karel-dataset. |
| Dataset Splits | Yes | The training dataset consists of 1,116,854 entries, and the test dataset contains 2,500 entries. Each entry in the dataset contains a Karel program and 6 input/output pairs which satisfy that program. For training the I/O → TRACE model, we used all 6 input/output pairs within each entry for a total of 6,701,124 training traces. For training the TRACE → CODE model (and our reimplementation of the I/O → CODE model from Bunel et al. [2018]), we randomly sample 5 out of the 6 input/output examples (and corresponding traces) each time we sample an entry from the training data. An illustrative sampling sketch follows the table. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments were provided in the paper. |
| Software Dependencies | No | The paper mentions deep learning components and training methods (e.g., "convolutional neural network", "two-layer LSTM decoder", "SGD with gradient clipping", "Adam") but does not provide specific version numbers for any software dependencies or libraries. An illustrative model sketch follows the table. |
| Experiment Setup | Yes | SGD with gradient clipping worked better for training the models than Adam. For all of the evaluations of TRACE → CODE, we used beam search with size 50. A training-step sketch follows the table. |
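
The paper does not release code, so the exact data pipeline is unknown. The snippet below is a minimal sketch of the "5 out of 6 input/output examples" sampling described in the Dataset Splits row, assuming each dataset entry is represented as a dict with hypothetical `program`, `examples`, and `traces` fields; the field names and structure are illustrative, not taken from the authors' pipeline.

```python
import random

def sample_training_entry(entry, num_pairs=5):
    # Pick 5 of the 6 I/O examples (and the matching traces) from one entry,
    # as described for TRACE -> CODE training. 'program', 'examples', and
    # 'traces' are hypothetical field names, not the dataset's actual schema.
    indices = random.sample(range(len(entry["examples"])), num_pairs)
    return {
        "program": entry["program"],
        "examples": [entry["examples"][i] for i in indices],
        "traces": [entry["traces"][i] for i in indices],
    }
```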
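
The Software Dependencies row quotes the components the paper names (a convolutional encoder over Karel grids and a two-layer LSTM decoder) without library versions. The PyTorch sketch below shows one plausible arrangement of those components; the class name `GridEncoderDecoder`, channel counts, kernel sizes, hidden size, and vocabulary size are all guesses, not the authors' implementation or hyperparameters.

```python
import torch
import torch.nn as nn

class GridEncoderDecoder(nn.Module):
    """Illustrative CNN grid encoder feeding a two-layer LSTM decoder."""

    def __init__(self, in_channels=15, hidden_size=256, vocab_size=52):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.proj = nn.Linear(64, hidden_size)
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.decoder = nn.LSTM(hidden_size, hidden_size, num_layers=2,
                               batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, grids, tokens):
        # grids: (batch, channels, height, width); tokens: (batch, seq_len)
        feats = self.encoder(grids).mean(dim=(2, 3))   # pooled grid features
        h0 = torch.tanh(self.proj(feats)).unsqueeze(0).repeat(2, 1, 1)
        c0 = torch.zeros_like(h0)
        dec_out, _ = self.decoder(self.embed(tokens), (h0, c0))
        return self.out(dec_out)                       # per-token logits
```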
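
The Experiment Setup row reports only that SGD with gradient clipping outperformed Adam, and that TRACE → CODE was evaluated with beam search of size 50 (a decoding setting, not sketched here). The snippet below shows a generic PyTorch training step with gradient clipping; the learning rate and clipping threshold are placeholders, since the paper does not state them.

```python
import torch

def training_step(model, batch, optimizer, max_grad_norm=1.0):
    # One optimization step with gradient clipping. max_grad_norm is a
    # placeholder value, not a setting reported in the paper.
    optimizer.zero_grad()
    loss = model(batch)  # assume the model's forward pass returns a scalar loss
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    return loss.item()

# The optimizer would be plain SGD; the learning rate here is a guess.
# optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
```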