Neuro-Symbolic Program Synthesis

Authors: Emilio Parisotto, Abdel-rahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, Pushmeet Kohli

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments show that the R3NN model is not only able to construct programs from new input-output examples, but it is also able to construct new programs for tasks that it had never observed before during training." "Our evaluation shows that NSPS is not only able to construct programs for known tasks from new input-output examples, but it is also able to construct completely new programs that it had not observed during training. Specifically, the proposed system is able to synthesize string transformation programs for 63% of tasks that it had not observed at training time, and for 94% of tasks when 100 program samples are taken from the model. Moreover, our system is able to learn 38% of 238 real-world Flash Fill benchmarks." (The sample-and-check protocol implied by the 100-sample figure is sketched after the table.)
Researcher Affiliation | Collaboration | Emilio Parisotto (1,2), Abdel-rahman Mohamed (1), Rishabh Singh (1), Lihong Li (1), Dengyong Zhou (1), Pushmeet Kohli (1); (1) Microsoft Research, USA; (2) Carnegie Mellon University, USA
Pseudocode | No | The paper describes the system architecture and algorithms using textual descriptions and figures, but no formal pseudocode blocks are present.
Open Source Code | No | The paper contains no explicit statement about releasing the source code for the described methodology, and it provides no link to a code repository.
Open Datasets | No | "In order to evaluate and compare variants of the previously described models, we generate a dataset randomly from the DSL." "We also evaluate our learnt models on 238 real-world Flash Fill benchmarks obtained from the Microsoft Excel team and online help-forums." While the paper cites the Flash Fill DSL (Gulwani, 2011; Gulwani et al., 2012), it does not provide a direct link to these specific 238 benchmarks.
Dataset Splits | Yes | "To do so, we first enumerate all possible programs under the DSL up to a specific number of instructions, which are then partitioned into training, validation and test sets." (A toy version of this enumerate-and-partition procedure is sketched after the table.)
Hardware Specification | No | "Because of limited GPU memory, the I/O encoder models can quickly run out of memory." This is the only mention of hardware, and it is too vague to identify the specific hardware used.
Software Dependencies | No | "Network weights used the default torch initializations." "We used ADAM (Kingma & Ba, 2014) to optimize the networks with a learning rate of 0.001." No version numbers are given for Torch/PyTorch (assuming "torch" refers to one of these) or any other library.
Experiment Setup | Yes | "For training the R3NN, two hyperparameters that were crucial for stabilizing training were the use of hyperbolic tangent activation functions in both R3NN (other activations such as ReLU more consistently diverged during our initial experiments) and cross-correlation I/O encoders and the use of minibatches of length 8. Additionally, for all results, the program tree generation is conditioned on a set of 10 input/output string pairs. We used ADAM (Kingma & Ba, 2014) to optimize the networks with a learning rate of 0.001. Network weights used the default torch initializations." The paper also mentions keeping "batch sizes to between 8-12." (A hedged PyTorch reconstruction of this setup is sketched after the table.)
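
The 94%-with-100-samples result quoted under Research Type implies a sample-and-check evaluation: draw multiple candidate programs from the generative model and count a task as solved if any candidate reproduces all of its input/output examples. The sketch below is an inference from that quoted result, not the authors' published code; `model.sample_program` and callable `program` objects are assumed interfaces.

```python
# Hypothetical sample-and-check protocol inferred from the paper's
# "100 program samples" result. All interfaces here are assumptions:
# `model.sample_program(io_pairs)` draws one program conditioned on the
# examples, and a sampled `program` can be called on an input string.

def solves_task(model, io_pairs, k=100):
    """Return True if any of k sampled programs is consistent with all pairs."""
    for _ in range(k):
        program = model.sample_program(io_pairs)  # condition on the examples
        # Accept the task as solved if this candidate maps every input
        # string to its expected output string.
        if all(program(inp) == out for inp, out in io_pairs):
            return True
    return False
```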
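
The Dataset Splits row quotes an enumerate-and-partition procedure. A minimal sketch of that procedure follows, using a toy instruction set in place of the paper's string-transformation DSL; the instruction names, the instruction-count cap, and the 80/10/10 split ratios are all assumptions, since the paper does not state them in the quoted passage.

```python
import itertools
import random

# Toy stand-in for the paper's DSL: real programs are trees over
# string-transformation operators, not flat opcode sequences.
INSTRUCTIONS = ["ConstStr", "SubStr", "Concat", "ToLower", "ToUpper"]

def enumerate_programs(max_len):
    """Enumerate every instruction sequence up to max_len instructions."""
    for length in range(1, max_len + 1):
        yield from itertools.product(INSTRUCTIONS, repeat=length)

programs = list(enumerate_programs(max_len=3))
random.Random(0).shuffle(programs)  # fixed seed for a reproducible partition

# Assumed 80/10/10 split; the paper does not report its exact ratios.
n_train = int(0.8 * len(programs))
n_val = int(0.1 * len(programs))
train = programs[:n_train]
val = programs[n_train:n_train + n_val]
test = programs[n_train + n_val:]
```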
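
Finally, a hedged PyTorch reconstruction of the reported training setup: tanh activations, Adam with a learning rate of 0.001, minibatches of 8, default torch weight initialization, and 10 input/output pairs per task. The R3NN itself is replaced by a placeholder module; the hidden size, input dimension, and encoder architecture are assumptions not given in the quoted text.

```python
import torch
import torch.nn as nn

HIDDEN = 256            # assumed; not specified in the quoted passage
IN_DIM = 128            # assumed encoded I/O feature size
BATCH_SIZE = 8          # "minibatches of length 8"
IO_PAIRS_PER_TASK = 10  # "conditioned on a set of 10 input/output string pairs"

class PlaceholderR3NN(nn.Module):
    """Stand-in for the R3NN; only the activation choice mirrors the paper."""
    def __init__(self, in_dim=IN_DIM, hidden=HIDDEN):
        super().__init__()
        # Tanh rather than ReLU: the authors report that ReLU "more
        # consistently diverged during our initial experiments".
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

model = PlaceholderR3NN()  # weights keep torch's default initialization
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # ADAM, lr 0.001

# Dummy batch: BATCH_SIZE tasks, each with IO_PAIRS_PER_TASK encoded pairs.
io_batch = torch.randn(BATCH_SIZE, IO_PAIRS_PER_TASK, IN_DIM)
out = model(io_batch)
```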