Generalized Planning for the Abstraction and Reasoning Corpus
Authors: Chao Lei, Nir Lipovetzky, Krista A. Ehinger
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that GPAR outperforms the state-of-the-art solvers on the object-centric tasks of the ARC, showing the effectiveness of GP and the expressiveness of PDDL to model ARC problems. |
| Researcher Affiliation | Academia | Chao Lei, Nir Lipovetzky, Krista A. Ehinger School of Computing and Information Systems, The University of Melbourne, Australia clei1@student.unimelb.edu.au, {nir.lipovetzky, kris.ehinger}@unimelb.edu.au |
| Pseudocode | Yes | The upper part of Figure 3 illustrates a planning program discovered by our solver that updates the color of any size-1 node (a collection of pixels) to black using two pointers, no and co, to iterate over node and color objects. ... Figure 5 illustrates the planning process of a planning program with a complex logic: the planning action Extend Node Direction is executed only when the first tested attribute is false (line 0) and the second tested spatial relation is true (line 2), using three pointers in total. |
| Open Source Code | Yes | Code is available at github.com/you68681/GPAR. |
| Open Datasets | Yes | The Abstraction and Reasoning Corpus (ARC) introduced by Chollet (2019)... As a benchmark, we use the subset of 160 object-centric ARC tasks introduced by Xu, Khalil, and Sanner (2023). |
| Dataset Splits | Yes | ARC comprises 1000 unique tasks where each task consists of a small set (typically three) of input-output image pairs for training, and generally one or occasionally multiple test pairs for evaluation (Figure 1). Each image is a 2D grid of pixels with 10 possible colors. ARC tasks require inferring the underlying rules or procedures from a few examples based on core knowledge priors including objectness, goal-directedness, numbers and counting, topology and geometry. |
| Hardware Specification | No | All experiments were conducted on a cloud computer with clock speeds of 2.00 GHz Xeon processors. This provides general information but lacks specific model numbers for the Xeon processors, amount of RAM, or details about GPUs, which are typically required for precise hardware replication. |
| Software Dependencies | No | The paper mentions using PDDL and improving the PGP(v) solver, but it does not specify version numbers for PGP(v) or any other ancillary software, libraries, or programming languages used in the experiments. |
| Experiment Setup | Yes | In GPAR, PGP(v) takes n, v, and Z as parameters. The number of program lines n ranges from 3 to 10 where the valid Π configuration for n = 3 is v = 1 since each instruction included in Π with n = 3 can only appear once, such as a test action, a goto instruction and a planning action. For n = 4, reasonable configurations include v = 1 and v = 2 since a planning action can appear twice. For n > 4, the value of v ranges from 1 to 3. All the possible combinations of Z are presented in Table 4, where only object types NODE, COLOR, and M-DIRECTION are referenced since they are typical specifications of parameters in the design action schemes. ... For each ARC task, possible combinations are executed in order of increasing complexity, starting from lower values of n and v, fewer pointers, and simpler abstractions (e.g., 4-connected are considered before 8-connected abstractions) with a time limit of 1800s for each. |