Neuro-Symbolic Program Synthesis

Authors: Emilio Parisotto, Abdel-rahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, Pushmeet Kohli

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments show that the R3NN model is not only able to construct programs from new input-output examples, but it is also able to construct new programs for tasks that it had never observed before during training." "Our evaluation shows that NSPS is not only able to construct programs for known tasks from new input-output examples, but it is also able to construct completely new programs that it had not observed during training. Specifically, the proposed system is able to synthesize string transformation programs for 63% of tasks that it had not observed at training time, and for 94% of tasks when 100 program samples are taken from the model. Moreover, our system is able to learn 38% of 238 real-world Flash Fill benchmarks." (The sample-and-check protocol implied by the 100-sample figure is sketched after the table.)
Researcher Affiliation | Collaboration | Emilio Parisotto (1,2), Abdel-rahman Mohamed (1), Rishabh Singh (1), Lihong Li (1), Dengyong Zhou (1), Pushmeet Kohli (1); (1) Microsoft Research, USA; (2) Carnegie Mellon University, USA
Pseudocode | No | The paper describes the system architecture and algorithms using textual descriptions and figures, but no formal pseudocode blocks are present.
Open Source Code | No | The paper contains no explicit statement about releasing the source code for the described methodology, and it provides no link to a code repository.
Open Datasets | No | "In order to evaluate and compare variants of the previously described models, we generate a dataset randomly from the DSL." "We also evaluate our learnt models on 238 real-world Flash Fill benchmarks obtained from the Microsoft Excel team and online help-forums." While the paper cites the Flash Fill DSL (Gulwani, 2011; Gulwani et al., 2012), it does not provide a direct link to these specific 238 benchmarks.
Dataset Splits | Yes | "To do so, we first enumerate all possible programs under the DSL up to a specific number of instructions, which are then partitioned into training, validation and test sets." (A toy version of this enumerate-and-partition procedure is sketched after the table.)
Hardware Specification | No | "Because of limited GPU memory, the I/O encoder models can quickly run out of memory." This is the only mention of hardware, and it is too vague to identify the specific hardware used.
Software Dependencies | No | "Network weights used the default torch initializations." "We used ADAM (Kingma & Ba, 2014) to optimize the networks with a learning rate of 0.001." No version numbers are given for Torch/PyTorch (assuming "torch" refers to one of these) or any other library.
Experiment Setup | Yes | "For training the R3NN, two hyperparameters that were crucial for stabilizing training were the use of hyperbolic tangent activation functions in both R3NN (other activations such as ReLU more consistently diverged during our initial experiments) and cross-correlation I/O encoders and the use of minibatches of length 8. Additionally, for all results, the program tree generation is conditioned on a set of 10 input/output string pairs. We used ADAM (Kingma & Ba, 2014) to optimize the networks with a learning rate of 0.001. Network weights used the default torch initializations." The paper also mentions keeping "batch sizes to between 8-12." (A hedged PyTorch reconstruction of this setup is sketched after the table.)
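
The 94%-with-100-samples result quoted under Research Type implies a sample-and-check evaluation: draw multiple candidate programs from the generative model and count a task as solved if any candidate reproduces all of its input/output examples. The sketch below is an inference from that quoted result, not the authors' published code; `model.sample_program` and callable `program` objects are assumed interfaces.

```python
# Hypothetical sample-and-check protocol inferred from the paper's
# "100 program samples" result. All interfaces here are assumptions:
# `model.sample_program(io_pairs)` draws one program conditioned on the
# examples, and a sampled `program` can be called on an input string.

def solves_task(model, io_pairs, k=100):
    """Return True if any of k sampled programs is consistent with all pairs."""
    for _ in range(k):
        program = model.sample_program(io_pairs)  # condition on the examples
        # Accept the task as solved if this candidate maps every input
        # string to its expected output string.
        if all(program(inp) == out for inp, out in io_pairs):
            return True
    return False
```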
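
The Dataset Splits row quotes an enumerate-and-partition procedure. A minimal sketch of that procedure follows, using a toy instruction set in place of the paper's string-transformation DSL; the instruction names, the instruction-count cap, and the 80/10/10 split ratios are all assumptions, since the paper does not state them in the quoted passage.

```python
import itertools
import random

# Toy stand-in for the paper's DSL: real programs are trees over
# string-transformation operators, not flat opcode sequences.
INSTRUCTIONS = ["ConstStr", "SubStr", "Concat", "ToLower", "ToUpper"]

def enumerate_programs(max_len):
    """Enumerate every instruction sequence up to max_len instructions."""
    for length in range(1, max_len + 1):
        yield from itertools.product(INSTRUCTIONS, repeat=length)

programs = list(enumerate_programs(max_len=3))
random.Random(0).shuffle(programs)  # fixed seed for a reproducible partition

# Assumed 80/10/10 split; the paper does not report its exact ratios.
n_train = int(0.8 * len(programs))
n_val = int(0.1 * len(programs))
train = programs[:n_train]
val = programs[n_train:n_train + n_val]
test = programs[n_train + n_val:]
```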
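
Finally, a hedged PyTorch reconstruction of the reported training setup: tanh activations, Adam with a learning rate of 0.001, minibatches of 8, default torch weight initialization, and 10 input/output pairs per task. The R3NN itself is replaced by a placeholder module; the hidden size, input dimension, and encoder architecture are assumptions not given in the quoted text.

```python
import torch
import torch.nn as nn

HIDDEN = 256            # assumed; not specified in the quoted passage
IN_DIM = 128            # assumed encoded I/O feature size
BATCH_SIZE = 8          # "minibatches of length 8"
IO_PAIRS_PER_TASK = 10  # "conditioned on a set of 10 input/output string pairs"

class PlaceholderR3NN(nn.Module):
    """Stand-in for the R3NN; only the activation choice mirrors the paper."""
    def __init__(self, in_dim=IN_DIM, hidden=HIDDEN):
        super().__init__()
        # Tanh rather than ReLU: the authors report that ReLU "more
        # consistently diverged during our initial experiments".
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

model = PlaceholderR3NN()  # weights keep torch's default initialization
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # ADAM, lr 0.001

# Dummy batch: BATCH_SIZE tasks, each with IO_PAIRS_PER_TASK encoded pairs.
io_batch = torch.randn(BATCH_SIZE, IO_PAIRS_PER_TASK, IN_DIM)
out = model(io_batch)
```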