Learning Program Synthesis for Integer Sequences from Scratch
Authors: Thibault Gauthier, Josef Urban
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our system is tested on the On-Line Encyclopedia of Integer Sequences. There, it discovers, on its own, solutions for 27987 sequences starting from basic operators and without human-written training examples. We describe the experiments in Section 7 and analyze some of the solutions in Section 8. |
| Researcher Affiliation | Academia | Thibault Gauthier, Josef Urban Czech Technical University in Prague, Czech Republic |
| Pseudocode | Yes | Figure 1: Pseudo-code of the self-learning procedure. |
| Open Source Code | Yes | The code for our project is publicly available in our repository (Gauthier and Urban 2022a). |
| Open Datasets | Yes | A compilation of such sequences is available in the On-Line Encyclopedia of Integer Sequences (OEIS) (Sloane 2007). |
| Dataset Splits | No | The paper does not provide fixed dataset splits (e.g., percentages or sample counts for training, validation, and test sets). The self-learning process generates training examples dynamically rather than drawing them from predefined splits of a fixed dataset. |
| Hardware Specification | Yes | Each of these experiments is run on a server with 32 hyperthreading Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz, 256 GB of memory, and no GPU cards. The operating system of the server is Ubuntu 20.4, GNU/Linux 5.4.0-40-generic x86_64. |
| Software Dependencies | No | The paper mentions the 'Intel MKL library' and 'Ubuntu 20.4' (operating system), but does not provide specific version numbers for the software libraries or dependencies needed to replicate the experiments. |
| Experiment Setup | Yes | Search parameters Each search phase is run in parallel on 16 cores targeting a total of 160 targets. On each target, a search is run for 10 minutes (2 minutes for side experiments cf. Section 7.2). ... Training parameters The TNN embedding dimension d is chosen to be 64. Each neural network block consists of two fully-connected layers with tanh activation functions. |
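
The Experiment Setup row above describes the tree neural network (TNN) blocks used for policy guidance: embedding dimension d = 64 and, per operator, two fully-connected layers with tanh activations. The following is a minimal sketch of such a block, assuming a PyTorch implementation; the authors' actual system is not written in Python (it relies on the Intel MKL library), and all names here are hypothetical illustrations of the stated architecture, not their code.

```python
# Hypothetical sketch of one TNN operator block as described in the paper:
# embedding dimension d = 64, two fully-connected layers with tanh activations.
# Framework (PyTorch) and all identifiers are assumptions for illustration only.
import torch
import torch.nn as nn

EMBEDDING_DIM = 64  # d = 64, as stated in the paper's training parameters


class OperatorBlock(nn.Module):
    """Neural-network block for an operator of a given arity.

    Maps the concatenated embeddings of the operator's arguments to a single
    d-dimensional embedding via two fully-connected layers with tanh.
    """

    def __init__(self, arity: int, d: int = EMBEDDING_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(arity * d, d),
            nn.Tanh(),
            nn.Linear(d, d),
            nn.Tanh(),
        )

    def forward(self, *args: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat(args, dim=-1))


if __name__ == "__main__":
    # Example: embed a binary term bottom-up from two leaf embeddings.
    leaf_x = torch.randn(1, EMBEDDING_DIM)  # hypothetical leaf embedding
    leaf_y = torch.randn(1, EMBEDDING_DIM)  # hypothetical leaf embedding
    plus_block = OperatorBlock(arity=2)
    term_embedding = plus_block(leaf_x, leaf_y)
    print(term_embedding.shape)  # torch.Size([1, 64])
```

Composing one such block per operator and applying them bottom-up over a program's syntax tree yields the term embedding that the search uses; the exact composition and training loop follow the paper's self-learning procedure (Figure 1), which is not reproduced here.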