Stepping Stones to Inductive Synthesis of Low-Level Looping Programs
Authors: Christopher D. Rosin2362-2370
AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present MAKESPEARE, a simple delayed-acceptance hillclimbing method that synthesizes low-level looping programs from input/output examples. ... The method performs well on a set of established benchmarks, and succeeds on the previously unsolved Collatz Numbers program synthesis problem. Additional benchmarks include the problem of rapidly sorting integer arrays, in which we observe the emergence of comb sort (a Shell sort variant that is empirically fast). ... Table 4 shows results (see Sec. 5.7 for TIS-100). |
| Researcher Affiliation | Industry | Christopher D. Rosin Parity Computing, Inc. San Diego, California, USA christopher.rosin@gmail.com |
| Pseudocode | Yes | Figure 1: Delayed Acceptance Hillclimbing. When applied to program synthesis, N is the period s best program, Y is search s current program, and Z is the new variant to eval. ... Figure 2: Local Search Operators ... Figure 3: Eval Scoring Function (see Sec 3.3 for TIS-100 detail) |
| Open Source Code | Yes | Implementation Code and data are available for download.3 https://github.com/Christopher Rosin/MAKESPEARE |
| Open Datasets | Yes | We use the explicit instances made available for download (Helmuth and Spector 2015a) rather than the protocol for generating new instances (Helmuth and Spector 2015c). ... Helmuth, T., and Spector, L. 2015a. General program synthesis training and test examples CSVs. github.com/thelmuth/Program-Synthesis-Benchmark-Data. |
| Dataset Splits | No | The paper describes the use of training and test sets but does not explicitly mention a separate 'validation' set or specific 'validation split' details for reproducibility. |
| Hardware Specification | Yes | The full set of Delayed Acceptance experiments below finish in 10 days, using 6 machines, each with a 4-core Intel i7-6700 CPU. |
| Software Dependencies | No | The paper mentions 'Dyn ASM (Pall 2017)' but does not provide a specific version number for this or any other software dependency. |
| Experiment Setup | Yes | With swap P {0, 0.1}, double P {0, 0.1, 0.5, 0.9}, copy P {0, 0.1, 0.5}, and I {3k, 10k, 25k, 75k, 150k}, a grid search was run on the preliminary benchmarks in Table 3. For each combination, Delayed Acceptance runs 100 times with max 300k evaluated programs per run. We find the parameters in Fig. 2 with I=75k have the most total runs that generalize perfectly, solving all 5 preliminary benchmarks. |