Neural Program Meta-Induction
Authors: Jacob Devlin, Rudy R. Bunel, Rishabh Singh, Matthew Hausknecht, Pushmeet Kohli
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using an extensive experimental evaluation on the Karel benchmark, we demonstrate that our proposals dramatically outperform the baseline induction method that does not use knowledge transfer. |
| Researcher Affiliation | Collaboration | Jacob Devlin (Google, jacobdevlin@google.com); Rudy Bunel (University of Oxford, rudy@robots.ox.ac.uk); Rishabh Singh (Microsoft Research, risin@microsoft.com); Matthew Hausknecht (Microsoft Research, mahauskn@microsoft.com); Pushmeet Kohli (DeepMind, pushmeet@google.com) |
| Pseudocode | No | The paper describes its models and methods using textual descriptions and architectural diagrams (Figure 2), but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code, nor does it include links to a code repository or mention code in supplementary materials for the described methodology. |
| Open Datasets | No | The paper states that the data was generated by the authors: “All training, validation, and test programs were generated by treating the Karel DSL as a probabilistic context free grammar and performing top-down expansion with uniform probability at each node.” No link or explicit statement about the public availability of this generated dataset is provided. |
| Dataset Splits | No | The paper mentions the use of “training, validation, and test programs” and states “The dropout, learning rate, and batch size were optimized with grid search for each value of n using a separate set of validation tasks.” However, it does not provide specific split percentages, sample counts, or explicit methodology for how these splits were created or their sizes. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments. It only mentions that “Training was performed using SGD + momentum and gradient clipping using an in-house toolkit.” |
| Software Dependencies | No | The paper mentions that “Training was performed using SGD + momentum and gradient clipping using an in-house toolkit,” but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or specific libraries). |
| Experiment Setup | Yes | The input encoder is a 3-layer CNN with an FC+relu layer on top. The output decoder is a 1-layer LSTM. For the META model, the task encoder uses a 1-layer CNN to encode the input and output grids of a single example, which are concatenated along the feature-map dimension and fed through a 6-layer CNN with an FC+relu layer on top. Multiple I/O examples are combined by max-pooling over the final vectors. All convolutional layers use a 3×3 kernel with a 64-dimensional feature map. The fully-connected and LSTM layers are 1024-dimensional. Different model sizes are explored later in that section. "The dropout, learning rate, and batch size were optimized with grid search for each value of n using a separate set of validation tasks. Training was performed using SGD + momentum and gradient clipping using an in-house toolkit." |
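The distinctive step in the META task encoder described above is its aggregation: each I/O example is mapped to a 1024-dimensional vector, and the per-example vectors are combined by elementwise max-pooling, making the task embedding invariant to example order. The sketch below illustrates only that aggregation step; the shared random-projection-plus-relu encoder, the function name `encode_task`, and all shapes are stand-in assumptions, not the paper's actual CNN stack.

```python
import numpy as np

def encode_task(io_pairs, dim=1024, seed=0):
    """Encode a set of I/O example pairs into one task embedding.

    Sketch of the META encoder's aggregation: each pair is mapped to a
    `dim`-vector (the paper uses per-grid CNNs, a 6-layer CNN, and an
    FC+relu layer; a shared random projection + relu stands in here),
    then the vectors are combined by elementwise max-pooling.
    """
    rng = np.random.default_rng(seed)
    # Flatten each I/O pair into one feature row (stand-in for CNN features).
    flat = np.stack([np.asarray(p, dtype=float).reshape(-1) for p in io_pairs])
    # One weight matrix shared across all examples.
    w = rng.standard_normal((flat.shape[1], dim)) * 0.01
    vecs = np.maximum(flat @ w, 0.0)   # FC + relu, applied per example
    return vecs.max(axis=0)            # max-pool over the example axis
```

Because max-pooling is commutative, permuting the I/O examples leaves the task embedding unchanged, which is what lets the model accept a variable-sized, unordered set of examples.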