Relational Program Synthesis with Numerical Reasoning
Authors: Céline Hocquette, Andrew Cropper
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on four diverse domains, including game playing and program synthesis, show that our approach can (i) learn programs with numerical values from linear arithmetical reasoning, and (ii) outperform existing approaches in terms of predictive accuracies and learning times. |
| Researcher Affiliation | Academia | University of Oxford; celine.hocquette@cs.ox.ac.uk, andrew.cropper@cs.ox.ac.uk |
| Pseudocode | No | The paper describes the steps of its approach, such as program search and numerical search, but it does not present these steps in a formal pseudocode block or an explicitly labeled algorithm figure. |
| Open Source Code | Yes | The experimental code and data are available at https://github.com/celinehocquette/numsynth-aaai23. |
| Open Datasets | Yes | The experimental code and data are available at https://github.com/celinehocquette/numsynth-aaai23. |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test splits (e.g., percentages or counts) or refer to a specific validation set. It focuses on training and testing. |
| Hardware Specification | Yes | We use an 8-Core 3.2 GHz Apple M1 and a single CPU. |
| Software Dependencies | No | The paper mentions POPPER (version 2.0.0, given in a footnote), Z3 (cited with the year 2008), Clingo (cited with the year 2014), and Prolog, but it gives no specific version numbers for several of these key components, so the environment cannot be fully replicated. |
| Experiment Setup | No | The paper mentions a timeout of 10 minutes per task, measuring mean and standard error over 10 trials, and the hardware used. However, it does not specify concrete hyperparameters (e.g., learning rate, batch size, number of epochs) or detailed system-level training configurations beyond the hardware and time limit. |
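
The Experiment Setup row notes that results are reported as a mean and standard error over 10 trials. As a minimal sketch of that summary statistic, assuming the standard error is the sample standard deviation divided by the square root of the number of trials (the paper's exact computation is not described here), the helper below aggregates per-trial values. The function name `mean_and_standard_error` and the accuracy values are hypothetical placeholders, not results from the paper.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for a list of per-trial results."""
    if len(values) < 2:
        raise ValueError("need at least two trials to estimate the standard error")
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Placeholder accuracies for 10 trials; these are NOT results from the paper.
trial_accuracies = [0.90, 0.92, 0.88, 0.91, 0.89, 0.93, 0.90, 0.87, 0.92, 0.88]
mean, sem = mean_and_standard_error(trial_accuracies)
print(f"accuracy: {mean:.3f} +/- {sem:.3f}")
```

The same helper could be applied to per-trial learning times, since the paper reports both predictive accuracies and learning times.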