CrossBeam: Learning to Search in Bottom-Up Program Synthesis

Authors: Kensen Shi, Hanjun Dai, Kevin Ellis, Charles Sutton

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate CROSSBEAM in two very different domains, string manipulation and logic programming. We observe that CROSSBEAM learns to search efficiently, exploring much smaller portions of the program space compared to the state-of-the-art. [Section 4, Experiments] Our experiments compare CROSSBEAM to state-of-the-art methods in two very different domains, string manipulation and inductive logic programming.
Researcher Affiliation | Collaboration | Kensen Shi (Google Research, kshi@google.com); Hanjun Dai (Google Research, hadai@google.com); Kevin Ellis (Cornell University, kellis@cornell.edu); Charles Sutton (Google Research, charlessutton@google.com)
Pseudocode | Yes | Algorithm 1: The CROSSBEAM algorithm, with training data generation in blue (a hedged sketch of the search loop appears below the table)
Open Source Code | Yes | https://github.com/google-research/crossbeam
Open Datasets | Yes | For the string manipulation domain, we use the same DSL, success criteria (satisfying the I/O examples), training task generation procedure, and test tasks as used in BUSTLE (Odena et al., 2021). We run CROSSBEAM on both sets of test tasks used to evaluate BUSTLE, with a search budget of 50,000 candidate programs. Our synthetic test set contains tasks of varying size, i.e., the minimum number of nodes in the expression tree of a solution. For each size in {5, ..., 19}, we randomly select 50 tasks among all possible tasks of that size, excluding those appearing in the training dataset. Our handcrafted test set, inspired by Peano arithmetic, comprises 30 common mathematical relationships... (a sketch of this per-size sampling appears below the table)
Dataset Splits | Yes | After every 10K training steps, we evaluate on synthetic validation tasks to choose the best model. For the string manipulation domain, we use the same DSL, success criteria (satisfying the I/O examples), training task generation procedure, and test tasks as used in BUSTLE (Odena et al., 2021).
Hardware Specification | Yes | We train CROSSBEAM with a distributed synchronized Adam optimizer on 8 V100 GPUs, with gradient accumulation of 4 for an effective batch size of 32.
Software Dependencies | No | The paper mentions 'distributed synchronized Adam optimizer', 'LSTM', 'MLP', and 'GREAT Transformer' but does not specify versions of programming languages, libraries, or frameworks (e.g., Python, TensorFlow, PyTorch versions).
Experiment Setup | Yes | We train CROSSBEAM with a distributed synchronized Adam optimizer on 8 V100 GPUs, with gradient accumulation of 4 for an effective batch size of 32. We generate 10M training tasks for string manipulation and 1M for logic programming. The models train for at most 500K steps or 5 days, whichever limit is reached first. We use K = 10 and learning rate 1 × 10⁻⁴ during training. (A gradient-accumulation training sketch appears below the table.)
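
Relating to the Pseudocode row: Algorithm 1 in the paper interleaves bottom-up execution of DSL operations with a learned model that proposes which already-explored values to combine next. Below is a minimal illustrative sketch of that search loop, not the authors' implementation: the Task and Operation interfaces and model.propose_argument_lists are assumed placeholder names, and the training-data generation shown in blue in the paper is omitted.

```python
# Illustrative sketch of a CrossBeam-style bottom-up search loop.
# `task`, `operations`, and `model` use hypothetical interfaces, not the
# actual crossbeam API.

def crossbeam_search(task, operations, model, k=10, budget=50_000):
    """Bottom-up search guided by a learned proposal model (sketch)."""
    explored = list(task.initial_values())  # input variables and constants
    num_candidates = 0
    while num_candidates < budget:
        progress = False
        for op in operations:
            # The learned model proposes up to k argument lists for `op`,
            # drawn from values already in `explored` (beam of size K = 10).
            for args in model.propose_argument_lists(op, explored, task, beam_size=k):
                num_candidates += 1
                progress = True
                value = op.execute(args)           # run on the I/O examples
                if value is None or value in explored:
                    continue                       # execution error or duplicate value
                if task.matches_output(value):
                    return value                   # matches all outputs: solved
                explored.append(value)             # keep the value for later reuse
                if num_candidates >= budget:
                    return None
        if not progress:
            return None                            # model proposed nothing new
    return None
```

In the paper, the proposals come from a neural model conditioned on the I/O examples and the values explored so far, and the same loop is replayed on generated training tasks to collect training data; this sketch shows only the inference-time search.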
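
Relating to the Open Datasets row: the synthetic test set draws, for each solution size from 5 to 19, 50 tasks uniformly from all tasks of that size while excluding anything seen during training. A small sketch of that selection step under assumed helpers (enumerate_tasks_of_size and the training_tasks set are hypothetical, not the paper's pipeline):

```python
import random

def sample_synthetic_test_set(enumerate_tasks_of_size, training_tasks,
                              sizes=range(5, 20), per_size=50, seed=0):
    """Pick `per_size` unseen tasks for each solution size in 5..19 (sketch)."""
    rng = random.Random(seed)
    test_set = []
    for size in sizes:
        # All tasks of this size that never appeared in the training data.
        candidates = [t for t in enumerate_tasks_of_size(size)
                      if t not in training_tasks]
        test_set.extend(rng.sample(candidates, per_size))
    return test_set
```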
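
Relating to the Hardware Specification and Experiment Setup rows: an effective batch size of 32 from 8 GPUs with gradient accumulation of 4 is consistent with a per-device micro-batch of 1 (8 × 4 × 1 = 32; the per-device batch is an assumption, since the paper states only the totals). The following is a generic single-device sketch of Adam with gradient accumulation at the reported learning rate of 1e-4; model, data_loader, and loss_fn are placeholders, and the distributed synchronization across GPUs is omitted.

```python
import torch

def train(model, data_loader, loss_fn, max_steps=500_000,
          accumulation_steps=4, lr=1e-4):
    """Adam with gradient accumulation (illustrative, single-device)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    optimizer.zero_grad()
    for step, batch in enumerate(data_loader):
        # Scale so the accumulated gradient equals the mean over micro-batches.
        loss = loss_fn(model, batch) / accumulation_steps
        loss.backward()                      # gradients add up across micro-batches
        if (step + 1) % accumulation_steps == 0:
            optimizer.step()                 # one update per 4 micro-batches
            optimizer.zero_grad()
        if (step + 1) // accumulation_steps >= max_steps:
            break                            # stop after the step budget
```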