Generalized Planning for the Abstraction and Reasoning Corpus

Authors: Chao Lei, Nir Lipovetzky, Krista A. Ehinger

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Our experiments demonstrate that GPAR outperforms the state-of-the-art solvers on the object-centric tasks of the ARC, showing the effectiveness of GP and the expressiveness of PDDL to model ARC problems.
Researcher Affiliation Academia Chao Lei, Nir Lipovetzky, Krista A. Ehinger School of Computing and Information Systems, The University of Melbourne, Australia clei1@student.unimelb.edu.au, {nir.lipovetzky, kris.ehinger}@unimelb.edu.au
Pseudocode Yes The upper part of Figure 3 illustrates a planning program discovered by our solver that updates the color of any size-1 node (a collection of pixels) to black using two pointers, no and co, to iterate over node and color objects. ... Figure 5 illustrates the planning process of a planning program with a complex logic: the planning action Extend Node Direction is executed only when the first tested attribute is false (line 0) and the second tested spatial relation is true (line 2), using three pointers in total.
Open Source Code Yes Code is available at github.com/you68681/GPAR.
Open Datasets Yes The Abstraction and Reasoning Corpus (ARC) introduced by Chollet (2019)... As a benchmark, we use the subset of 160 object-centric ARC tasks introduced by Xu, Khalil, and Sanner (2023).
Dataset Splits Yes ARC comprises 1000 unique tasks where each task consists of a small set (typically three) of input-output image pairs for training, and generally one or occasionally multiple test pairs for evaluation (Figure 1). Each image is a 2D grid of pixels with 10 possible colors. ARC tasks require inferring the underlying rules or procedures from a few examples based on core knowledge priors including objectness, goal-directedness, numbers and counting, topology and geometry.
Hardware Specification No All experiments were conducted on a cloud computer with clock speeds of 2.00 GHz Xeon processors. This provides general information but lacks specific model numbers for the Xeon processors, amount of RAM, or details about GPUs, which are typically required for precise hardware replication.
Software Dependencies No The paper mentions using PDDL and improving the PGP(v) solver, but it does not specify version numbers for PGP(v) or any other ancillary software, libraries, or programming languages used in the experiments.
Experiment Setup Yes In GPAR, PGP(v) takes n, v, and Z as parameters. The number of program lines n ranges from 3 to 10 where the valid Π configuration for n = 3 is v = 1 since each instruction included in Π with n = 3 can only appear once, such as a test action, a goto instruction and a planning action. For n = 4, reasonable configurations include v = 1 and v = 2 since a planning action can appear twice. For n > 4, the value of v ranges from 1 to 3. All the possible combinations of Z are presented in Table 4, where only object types NODE, COLOR, and M-DIRECTION are referenced since they are typical specifications of parameters in the design action schemes. ... For each ARC task, possible combinations are executed in order of increasing complexity, starting from lower values of n and v, fewer pointers, and simpler abstractions (e.g., 4-connected are considered before 8-connected abstractions) with a time limit of 1800s for each.