Neural-Guided Deductive Search for Real-Time Program Synthesis from Examples
Authors: Ashwin Kalyan, Abhishek Mohta, Oleksandr Polozov, Dhruv Batra, Prateek Jain, Sumit Gulwani
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate our NGDS algorithm over the string manipulation domain with a DSL given by Figure 2; see Figure 1 for an example task. We evaluate NGDS, its ablations, and baseline techniques on two key metrics: (a) generalization accuracy on unseen inputs, (b) synthesis time. Dataset. We use a dataset of 375 tasks collected from real-world customer string manipulation problems, split into 65% training, 15% validation, and 20% test data. |
| Researcher Affiliation | Collaboration | Ashwin K. Vijayakumar & Dhruv Batra, School of Interactive Computing, Georgia Tech, Atlanta, GA 30308, USA ({ashwinkv,dbatra}@gatech.edu); Abhishek Mohta & Prateek Jain, Microsoft Research India, Bengaluru, Karnataka 560001, India ({t-abmoht,prajain}@microsoft.com); Oleksandr Polozov & Sumit Gulwani, Microsoft Research Redmond, Redmond, WA 98052, USA ({polozov,sumitg}@microsoft.com) |
| Pseudocode | Yes | Figure 5: The controllers for guiding the search process to construct a most generalizable ϕ-satisfying program set S of size k given the f-predicted best scores s1, ..., sn of the productions F1, ..., Fn. Figure 6: Neural-guided deductive search over L, parameterized with a branch selection controller C. (A hedged controller sketch follows the table.) |
| Open Source Code | No | The paper does not state that the source code for the methodology is publicly available, nor does it provide a link to a code repository. |
| Open Datasets | No | We use a dataset of 375 tasks collected from real-world customer string manipulation problems, split into 65% training, 15% validation, and 20% test data. Some of the common applications found in our dataset include date/time formatting, manipulating addresses, modifying names, automatically generating email IDs, etc. Each task contains about 10 inputs, of which only one is provided as the spec to the synthesis system, mimicking industrial applications. The remaining unseen examples are used to evaluate generalization performance of the synthesized programs. After running synthesis of top-1 programs with PROSE on all training tasks, we have collected a dataset of 400,000 intermediate search decisions, i.e. triples ⟨production Γ, spec ϕ, a posteriori best score h(Γ, ϕ)⟩. (The record format is sketched after the table.) |
| Dataset Splits | Yes | We use a dataset of 375 tasks collected from real-world customer string manipulation problems, split into 65% training, 15% validation, and 20% test data. |
| Hardware Specification | Yes | We run all the methods on the same machine with 2.3 GHz Intel Xeon processor, 64GB of RAM, and Windows Server 2016. |
| Software Dependencies | No | Finally, we train all our LSTM-based models with CNTK (Seide & Agarwal, 2016) using Adam (Kingma & Ba, 2014) with a learning rate of 10⁻² and a batch size of 32, using early stopping on the validation loss to select the best performing model (thus, 100-600 epochs). |
| Experiment Setup | Yes | Finally, we train all our LSTM-based models with CNTK (Seide & Agarwal, 2016) using Adam (Kingma & Ba, 2014) with a learning rate of 10⁻² and a batch size of 32, using early stopping on the validation loss to select the best performing model (thus, 100-600 epochs). (A hedged training-loop sketch follows the table.) |
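
The Pseudocode row above summarizes the controllers of Figures 5 and 6, which decide which grammar branches to explore based on the model-predicted scores s1, ..., sn of the productions F1, ..., Fn. Below is a minimal Python sketch of a threshold-style branch selection controller in that spirit; the `explore` callback, the `theta` tolerance, and all identifiers are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, List, Sequence, Tuple


def threshold_controller(
    scores: Sequence[float],                            # f-predicted best scores s1..sn
    explore: Callable[[int], List[Tuple[str, float]]],  # deductive search on branch i -> [(program, score)]
    k: int,                                             # size of the program set to return
    theta: float = 0.1,                                 # assumed tolerance below the best prediction
) -> List[Tuple[str, float]]:
    """Explore only branches predicted to be near-optimal; return the top-k programs found."""
    best_prediction = max(scores)
    candidates: List[Tuple[str, float]] = []
    for i, s_i in enumerate(scores):
        # Prune branches the model predicts to be clearly worse than the best branch.
        if s_i < best_prediction - theta:
            continue
        candidates.extend(explore(i))
    # Keep the k highest-scoring (most generalizable) programs across the explored branches.
    return sorted(candidates, key=lambda ps: ps[1], reverse=True)[:k]
```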
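
The Open Datasets row describes the training data as roughly 400,000 triples of production, spec, and a posteriori best score. The sketch below shows what one such record might look like; the class name, field names, and toy example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SearchDecision:
    production: str              # DSL production Γ being expanded at this search node
    spec: List[Tuple[str, str]]  # input-output examples ϕ constraining the node
    best_score: float            # a posteriori best score h(Γ, ϕ) found by full search


# A single hypothetical record out of the ~400,000 collected from PROSE top-1 synthesis runs.
record = SearchDecision(
    production="Concat",
    spec=[("john smith", "John Smith")],
    best_score=0.87,
)
```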
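
The Software Dependencies and Experiment Setup rows report the training recipe (CNTK, Adam, learning rate 10⁻², batch size 32, early stopping on validation loss over 100-600 epochs). The framework-agnostic Python sketch below mirrors that recipe; the `model` interface (`training_step`, `snapshot`), the `patience` value, and the config names are assumptions, not CNTK API.

```python
TRAINING_CONFIG = {
    "optimizer": "Adam",     # Kingma & Ba (2014)
    "learning_rate": 1e-2,
    "batch_size": 32,
    "max_epochs": 600,       # the paper reports 100-600 epochs with early stopping
}


def train_with_early_stopping(model, train_batches, val_loss_fn, patience=10):
    """Stop once validation loss has not improved for `patience` consecutive epochs."""
    best_val, stale_epochs, best_state = float("inf"), 0, None
    for epoch in range(TRAINING_CONFIG["max_epochs"]):
        for batch in train_batches:        # batches of size TRAINING_CONFIG["batch_size"]
            model.training_step(batch)     # assumed model API: one optimizer update
        val_loss = val_loss_fn(model)
        if val_loss < best_val:
            best_val, stale_epochs = val_loss, 0
            best_state = model.snapshot()  # assumed model API: keep the best weights
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break
    return best_state
```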