Scaling Neural Program Synthesis with Distribution-Based Search

Authors: Nathanaël Fijalkow, Guillaume Lagarde, Théo Matricon, Kevin Ellis, Pierre Ohlmann, Akarsh Nayan Potta

AAAI 2022, pp. 6623-6630 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our contributions target the second stage of this pipeline, and we focus on theoretical analysis of sampling-based search algorithms, new search algorithms based on neurally-informed enumeration, and empirical evaluations showing that recent neural program synthesizers can compose well with our methods.
Researcher Affiliation | Academia | 1 CNRS, LaBRI and Université de Bordeaux, France; 2 The Alan Turing Institute of data science, United Kingdom; 3 Cornell University, United States; 4 University of Paris, France; 5 Indian Institute of Technology Bombay, India
Pseudocode | No | We refer to the long version (Fijalkow et al. 2021) for a complete description of the algorithm with a pseudocode, a proof of Theorem 1, and a computational complexity analysis.
Open Source Code | No | The paper mentions that algorithms were "reimplemented and optimised in the codebase" but does not provide any explicit statement or link indicating that this codebase is open source or publicly available.
Open Datasets | Yes | We extracted 218 problems from DreamCoder's dataset (Ellis et al. 2021). (The experiments can be easily replicated on DeepCoder's dataset (Balog et al. 2017) but we do not report on the results here.)
Dataset Splits | No | The paper mentions training a neural network on "synthetic data" and using a "filtered dataset of 137 tasks" for evaluation, but it does not specify how these tasks or the synthetic data were split into training, validation, and test sets for their experiments.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory).
Software Dependencies | No | The paper mentions using the "Alias method (Walker 1977)" for SQRT SAMPLING and describes the neural network architecture, but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). (A sketch of alias-method sampling appears after this table.)
Experiment Setup | Yes | For each task we run every search algorithm on the PCFG induced by the neural predictions with a timeout of 100s and a maximum of 1M programs. (A sketch of this evaluation loop appears after this table.)