Learning Domain-Independent Heuristics for Grounded and Lifted Planning
Authors: Dillon Z. Chen, Sylvie Thiébaux, Felipe Trevizan
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then conduct two sets of experiments to complement our theory and evaluate the effectiveness of learned heuristics. ... Our experiments show that our heuristics generalise to much larger problems than those in the training set, vastly surpassing STRIPS-HGN heuristics. |
| Researcher Affiliation | Academia | Dillon Z. Chen1,2, Sylvie Thiébaux1,2, Felipe Trevizan1 1School of Computing, The Australian National University 2LAAS-CNRS, Université de Toulouse |
| Pseudocode | No | The paper provides mathematical definitions and descriptions of models but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at (Chen, Thiébaux, and Trevizan 2023a). |
| Open Datasets | Yes | For domain-dependent heuristic learning, we train 5 models for each domain on optimal plans with problems specified in Tab. 1. ... For domain-independent heuristic learning, we consider the problems and domains of the 1998 to 2018 IPC dataset, excluding the domains in Tab. 1. We train 5 models using optimal plans generated by scorpion (Seipp, Keller, and Helmert 2020) with a 30min cutoff time and unit costs. |
| Dataset Splits | Yes | Table 1: Problem training/validation/testing splits with sizes and number of tasks per domain. ... We schedule our learning rate by extracting 25% of the training data and reducing the learning rate by a factor of 10 if the loss on this data subset did not decrease in the last 10 epochs. Training is stopped when the learning rate becomes less than 10⁻⁵ on this subset, which often occurs within a few minutes. Following a similar method to Ferber, Helmert, and Hoffmann (2020), we select the best model for both settings by choosing the model which solves the most problems in the validation set (Tab. 1). |
| Hardware Specification | Yes | GOOSE is run with a single NVIDIA GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions several software components, such as the 'Adam optimiser' (Kingma and Ba 2015), 'Fast Downward' (Helmert 2006), and 'RGCN' (Schlichtkrull et al. 2018), but does not provide version numbers for these dependencies, which are needed for reproducibility. |
| Experiment Setup | Yes | In both settings, a model is trained with the Adam optimiser (Kingma and Ba 2015), batch size 16, initial learning rate of 0.001 and MSE loss. We schedule our learning rate by extracting 25% of the training data and reducing the learning rate by a factor of 10 if the loss on this data subset did not decrease in the last 10 epochs. Training is stopped when the learning rate becomes less than 10⁻⁵ on this subset, which often occurs within a few minutes. ... We choose a hidden dimension of 64 with 8 message passing layers and the mean aggregator. (Hedged code sketches of this training configuration and architecture follow the table.) |
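
The quoted training configuration maps onto a standard supervised regression loop. Below is a minimal sketch, assuming PyTorch; the arguments `model`, `train_loader`, `holdout_loader`, the `evaluate` helper, and the epoch cap are hypothetical placeholders, not identifiers from the released GOOSE code.

```python
# Hedged sketch of the reported training setup: Adam, batch size 16,
# initial learning rate 1e-3, MSE loss, LR divided by 10 when the loss on a
# held-out 25% training subset has not decreased for 10 epochs, and training
# stopped once the LR falls below 1e-5. Not the authors' implementation.
import torch

def train(model, train_loader, holdout_loader, evaluate, max_epochs=10_000):
    optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimiser, mode="min", factor=0.1, patience=10)

    for _ in range(max_epochs):                  # cap is an assumption
        model.train()
        for batch, target in train_loader:       # DataLoader built with batch_size=16
            optimiser.zero_grad()
            loss = loss_fn(model(batch), target) # regress onto optimal plan costs
            loss.backward()
            optimiser.step()

        # Loss on the 25% training subset drives the LR schedule.
        scheduler.step(evaluate(model, holdout_loader))

        if optimiser.param_groups[0]["lr"] < 1e-5:  # stopping criterion
            break
    return model
```

Model selection (keeping the checkpoint that solves the most validation problems, per the quote above) happens outside this loop and is not shown.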
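
For the quoted architecture hyper-parameters (RGCN message passing, hidden dimension 64, 8 layers, mean aggregation), a sketch using PyTorch Geometric's `RGCNConv` might look as follows; the graph-level sum readout and the input encoding (`in_dim`, `num_relations`) are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of the quoted GNN hyper-parameters (hidden dim 64,
# 8 message-passing layers, mean neighbourhood aggregation), assuming
# PyTorch Geometric. This is not the GOOSE code itself.
import torch
from torch_geometric.nn import RGCNConv, global_add_pool

class HeuristicGNN(torch.nn.Module):
    def __init__(self, in_dim: int, num_relations: int,
                 hidden: int = 64, layers: int = 8):
        super().__init__()
        self.convs = torch.nn.ModuleList()
        dims = [in_dim] + [hidden] * layers
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            # aggr="mean" selects the mean aggregator (RGCNConv's default)
            self.convs.append(RGCNConv(d_in, d_out, num_relations, aggr="mean"))
        self.readout = torch.nn.Linear(hidden, 1)  # scalar heuristic estimate

    def forward(self, x, edge_index, edge_type, batch):
        for conv in self.convs:
            x = torch.relu(conv(x, edge_index, edge_type))
        # Graph-level sum pooling before the readout is an assumption here.
        return self.readout(global_add_pool(x, batch)).squeeze(-1)
```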