Learning Program Synthesis for Integer Sequences from Scratch

Authors: Thibault Gauthier, Josef Urban

AAAI 2023

Reproducibility assessment (variable, result, and the supporting LLM response):

Research Type: Experimental
    "Our system is tested on the On-Line Encyclopedia of Integer Sequences. There, it discovers, on its own, solutions for 27987 sequences starting from basic operators and without human-written training examples. We describe the experiments in Section 7 and analyze some of the solutions in Section 8."

Researcher Affiliation: Academia
    "Thibault Gauthier, Josef Urban; Czech Technical University in Prague, Czech Republic"

Pseudocode: Yes
    "Figure 1: Pseudo-code of the self-learning procedure."

Open Source Code: Yes
    "The code for our project is publicly available in our repository (Gauthier and Urban 2022a)."

Open Datasets: Yes
    "A compilation of such sequences is available in the On-Line Encyclopedia of Integer Sequences (OEIS) (Sloane 2007)."

Dataset Splits: No
    The paper does not give fixed dataset splits (e.g., percentages or sample counts for training, validation, and test sets). The self-learning process generates training examples dynamically rather than drawing them from a pre-defined split.

Hardware Specification: Yes
    "Each of these experiments is run on a server with 32 hyperthreading Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz, 256 GB of memory, and no GPU cards. The operating system of the server is Ubuntu 20.4, GNU/Linux 5.4.0-40-generic x86_64."

Software Dependencies: No
    The paper mentions the Intel MKL library and Ubuntu 20.4 (the operating system) but does not give version numbers for the software libraries needed to replicate the experiments.

Experiment Setup: Yes
    "Search parameters: Each search phase is run in parallel on 16 cores targeting a total of 160 targets. On each target, a search is run for 10 minutes (2 minutes for side experiments, cf. Section 7.2). ... Training parameters: The TNN embedding dimension d is chosen to be 64. Each neural network block consists of two fully-connected layers with tanh activation functions."
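The self-learning procedure (Figure 1 in the paper) alternates a search phase over candidate programs with a learning phase on the solutions found, without human-written training examples. The sketch below is a rough, hedged illustration of that loop, not the paper's implementation: it replaces the tree neural network with simple operator-frequency weights, and all names (`search_phase`, `learning_phase`, the four toy operators, the OEIS IDs used as keys) are illustrative stand-ins.

```python
import random

random.seed(0)  # deterministic for the example

# Toy "basic operators": each program is a single unary function of n.
OPS = {
    "n":      lambda n: n,
    "succ_n": lambda n: n + 1,
    "double": lambda n: 2 * n,
    "square": lambda n: n * n,
}

def run(program, length=8):
    """Evaluate a one-operator program on inputs 0..length-1."""
    f = OPS[program]
    return [f(n) for n in range(length)]

def search_phase(targets, weights, tries=200):
    """Sample programs according to the current weights; keep matches."""
    ops, w = list(weights), list(weights.values())
    solutions = {}
    for _ in range(tries):
        p = random.choices(ops, weights=w)[0]
        for name, seq in targets.items():
            if name not in solutions and run(p, len(seq)) == seq:
                solutions[name] = p
    return solutions

def learning_phase(weights, solutions):
    """Reinforce operators that appeared in found solutions
    (a crude stand-in for training the paper's TNN)."""
    for p in solutions.values():
        weights[p] += 1.0
    return weights

targets = {"A000290": [0, 1, 4, 9, 16, 25, 36, 49],   # squares
           "A005843": [0, 2, 4, 6, 8, 10, 12, 14]}    # even numbers
weights = {op: 1.0 for op in OPS}
for _ in range(3):                 # a few self-learning iterations
    sols = search_phase(targets, weights)
    weights = learning_phase(weights, sols)
```

After each iteration, solved (sequence, program) pairs bias the next search, mirroring the search/learn alternation the paper describes at much larger scale (16 cores, 160 targets, 10-minute searches).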
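The quoted training parameters describe each neural-network block as two fully-connected layers with tanh activations, with TNN embedding dimension d = 64. A minimal pure-Python sketch of one such block, with d = 4 for brevity and an illustrative random weight initialization (the paper does not specify one here), might look like:

```python
import math
import random

def linear(x, W, b):
    """Fully-connected layer: y = W x + b."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def block(x, params):
    """Two linear layers, each followed by a tanh activation."""
    (W1, b1), (W2, b2) = params
    h = [math.tanh(v) for v in linear(x, W1, b1)]
    return [math.tanh(v) for v in linear(h, W2, b2)]

def init_block(d, rng):
    """Random d x d weights and zero biases for both layers
    (illustrative initialization, not the paper's)."""
    def mk():
        return [[rng.uniform(-0.1, 0.1) for _ in range(d)]
                for _ in range(d)]
    return [(mk(), [0.0] * d), (mk(), [0.0] * d)]

rng = random.Random(0)
d = 4                      # the paper uses d = 64
params = init_block(d, rng)
y = block([1.0, -0.5, 0.25, 0.0], params)
```

Because tanh maps into (-1, 1), the block's output stays bounded regardless of the input embedding, which keeps stacked blocks numerically stable.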