reproducibilityindex.ai

Leveraging Grammar and Reinforcement Learning for Neural Program Synthesis

Authors: Rudy Bunel, Matthew Hausknecht, Jacob Devlin, Rishabh Singh, Pushmeet Kohli

ICLR 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	6 EXPERIMENTS
Researcher Affiliation	Collaboration	Rudy Bunel (University of Oxford, rudy@robots.ox.ac.uk); Matthew Hausknecht (Microsoft Research, matthew.hausknecht@microsoft.com); Jacob Devlin (Google, jacobdevlin@google.com); Rishabh Singh (Microsoft Research, risin@microsoft.com); Pushmeet Kohli (DeepMind, pushmeet@google.com)
Pseudocode	No	The paper provides a DSL specification but no structured pseudocode or algorithm blocks.
Open Source Code	No	Code and data will be made available.
Open Datasets	Yes	The Karel DSL was previously used by Devlin et al. (2017a) to study the relative performance of a range of methods depending on the available amount of data.
Dataset Splits	Yes	5000 programs are not used for training, and get split out between a validation set and a test set.
Hardware Specification	No	The paper does not provide specific details on the hardware used for experiments, such as GPU or CPU models.
Software Dependencies	No	Our models are implemented using the PyTorch framework (pyt).
Experiment Setup	Yes	The decoders are two-layer LSTMs with a hidden size of 256. Tokens of the DSL are embedded to a 256-dimensional vector... All training is performed using the Adam optimizer, with a learning rate of 10^-4. Supervised training used a batch size of 128 and RL methods used a batch size of 16. We used 100 rollouts per sample for the Reinforce method and a beam size of 64 for methods based on beam search. The value of C used for the methods computing a loss on bags of programs was 5.
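
For readers attempting a reproduction, the hyperparameters quoted in the Experiment Setup row can be gathered into a single configuration object. The sketch below is illustrative only: the class name, field names, and the `batch_size` helper are hypothetical and do not come from the authors' code, and settings elided in the quote (the "...") are deliberately left out rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the Experiment Setup row.

    All names are hypothetical; only the values come from the quoted text.
    """

    decoder_lstm_layers: int = 2             # "two-layer LSTM"
    decoder_hidden_size: int = 256           # "hidden size of 256"
    token_embedding_dim: int = 256           # "256 dimensional vector"
    optimizer: str = "adam"                  # "Adam optimizer"
    learning_rate: float = 1e-4              # "learning rate of 10^-4"
    supervised_batch_size: int = 128         # supervised training
    rl_batch_size: int = 16                  # RL methods
    reinforce_rollouts_per_sample: int = 100 # Reinforce method
    beam_size: int = 64                      # beam-search-based methods
    bag_c: int = 5                           # "value of C" for bag-of-programs losses

    def batch_size(self, mode: str) -> int:
        """Return the batch size for a training mode ("supervised" or "rl")."""
        if mode == "supervised":
            return self.supervised_batch_size
        if mode == "rl":
            return self.rl_batch_size
        raise ValueError(f"unknown training mode: {mode!r}")


if __name__ == "__main__":
    cfg = ExperimentSetup()
    print(cfg.batch_size("supervised"), cfg.batch_size("rl"), cfg.learning_rate)
```

Freezing the dataclass keeps the quoted values from being changed by accident during a run; anything the row does not state (hardware, random seeds, number of epochs) would still have to be chosen by whoever reproduces the work.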