CrossBeam: Learning to Search in Bottom-Up Program Synthesis
Authors: Kensen Shi, Hanjun Dai, Kevin Ellis, Charles Sutton
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate CROSSBEAM in two very different domains, string manipulation and logic programming. We observe that CROSSBEAM learns to search efficiently, exploring much smaller portions of the program space compared to the state-of-the-art. (Section 4, Experiments) Our experiments compare CROSSBEAM to state-of-the-art methods in two very different domains, string manipulation and inductive logic programming. |
| Researcher Affiliation | Collaboration | Kensen Shi (Google Research) kshi@google.com; Hanjun Dai (Google Research) hadai@google.com; Kevin Ellis (Cornell University) kellis@cornell.edu; Charles Sutton (Google Research) charlessutton@google.com |
| Pseudocode | Yes | Algorithm 1 The CROSSBEAM algorithm, with training data generation in blue (a hedged sketch of the search loop appears below this table) |
| Open Source Code | Yes | https://github.com/google-research/crossbeam |
| Open Datasets | Yes | For the string manipulation domain, we use the same DSL, success criteria (satisfying the I/O examples), training task generation procedure, and test tasks as used in BUSTLE (Odena et al., 2021). We run CROSSBEAM on both sets of test tasks used to evaluate BUSTLE, with a search budget of 50,000 candidate programs. Our synthetic test set contains tasks of varying size, i.e., the minimum number of nodes in the expression tree of a solution. For each size in {5, ..., 19}, we randomly select 50 tasks among all possible tasks of that size, excluding those appearing in the training dataset. Our handcrafted test set, inspired by Peano arithmetic, comprises 30 common mathematical relationships... |
| Dataset Splits | Yes | After every 10K training steps, we evaluate on synthetic validation tasks to choose the best model. For the string manipulation domain, we use the same DSL, success criteria (satisfying the I/O examples), training task generation procedure, and test tasks as used in BUSTLE (Odena et al., 2021). |
| Hardware Specification | Yes | We train CROSSBEAM with a distributed synchronized Adam optimizer on 8 V100 GPUs, with gradient accumulation of 4 for an effective batch size of 32. |
| Software Dependencies | No | The paper mentions 'distributed synchronized Adam optimizer', 'LSTM', 'MLP', and 'GREAT Transformer' but does not specify versions of programming languages, libraries, or frameworks (e.g., Python, TensorFlow, PyTorch versions). |
| Experiment Setup | Yes | We train CROSSBEAM with a distributed synchronized Adam optimizer on 8 V100 GPUs, with gradient accumulation of 4 for an effective batch size of 32. We generate 10M training tasks for string manipulation and 1M for logic programming. The models train for at most 500K steps or 5 days, whichever limit is reached first. We use K = 10 and learning rate 1 × 10⁻⁴ during training. (An illustrative training-loop sketch appears below this table.) |
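The following is a minimal sketch of the bottom-up search loop that Algorithm 1 describes, assuming a learned policy that proposes which operation and argument values to combine next. The callables `propose`, `execute`, and `satisfies` are hypothetical stand-ins for the learned model and the DSL interpreter, not the authors' released API; only the default budget of 50,000 candidate programs comes from the paper.

```python
# Hedged sketch of a CrossBeam-style bottom-up search loop (cf. Algorithm 1).
# `propose`, `execute`, and `satisfies` are hypothetical placeholders for the
# learned policy and the DSL interpreter; they are not the paper's actual API.

def crossbeam_search(io_examples, initial_values, propose, execute, satisfies,
                     budget=50_000):
    """Return a value whose program satisfies the I/O examples, or None."""
    explored = list(initial_values)        # values built so far (constants, inputs)
    seen = set(explored)                   # deduplicate by value semantics
    for _ in range(budget):                # budget counts candidate programs
        # The model conditions on the I/O examples and the explored set, then
        # proposes an operation together with argument values to combine.
        op, args = propose(io_examples, explored)
        value = execute(op, args)          # run the candidate subprogram
        if value is None or value in seen:
            continue                       # invalid execution or duplicate value
        seen.add(value)
        explored.append(value)
        if satisfies(value, io_examples):  # output matches every I/O example
            return value
    return None                            # candidate budget exhausted
```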
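The reported optimization settings (Adam, learning rate 1 × 10⁻⁴, gradient accumulation of 4, at most 500K steps) can be illustrated with the PyTorch-style loop below. The paper does not name the framework or the per-device batch size, so the model and data-loader interfaces here are assumptions for illustration only.

```python
# Illustrative sketch of the reported settings: Adam with learning rate 1e-4
# and gradient accumulation over 4 micro-batches. The model/data interfaces
# are assumed; the paper does not specify the framework or per-device batch.

import torch

def train(model, data_loader, max_steps=500_000, accumulation_steps=4):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    optimizer.zero_grad()
    for step, batch in enumerate(data_loader):
        if step >= max_steps:                     # paper caps training at 500K steps
            break
        loss = model(batch)                       # assumed: model returns a scalar loss
        (loss / accumulation_steps).backward()    # average gradients over micro-batches
        if (step + 1) % accumulation_steps == 0:
            optimizer.step()                      # one parameter update per 4 micro-batches
            optimizer.zero_grad()
```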