Guiding GBFS through Learned Pairwise Rankings

Authors: Mingyu Hao, Felipe Trevizan, Sylvie Thiébaux, Patrick Ferber, Jörg Hoffmann

IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on the domains of the latest planning competition learning track show that our approach substantially improves the coverage of the underlying neural network models without degrading plan quality.
Researcher Affiliation | Academia | (1) Australian National University, Australia; (2) LAAS-CNRS, Université de Toulouse, France; (3) University of Basel, Switzerland; (4) Saarland University, Germany
Pseudocode | Yes | Algorithm 1: GBFS Algorithm (a minimal sketch follows the table)
Open Source Code | Yes | The source code can be found at [Hao et al., 2024a]. [Hao et al., 2024a] Mingyu Hao, Sylvie Thiébaux, and Felipe Trevizan. Source Code for Guiding GBFS Through Learned Pairwise Rankings. https://doi.org/10.5281/zenodo.11107790, 2024.
Open Datasets | Yes | We use the 10 domains from the 2023 International Planning Competition Learning Track (IPC23-LT) [Segovia and Seipp, 2023]. [Segovia and Seipp, 2023] Javier Segovia and Jendrik Seipp. Benchmarking Repository of IPC 2023 Learning Track. https://github.com/ipc2023-learning/benchmarks, 2023.
Dataset Splits | Yes | The training problems are randomly split between the training set (90%) and the validation set (10%). (See the training sketch after the table.)
Hardware Specification | Yes | All experiments are run on an Intel Xeon 2.1 GHz CPU and an NVIDIA A6000 GPU with 64 GB of memory.
Software Dependencies | No | The paper mentions tools such as Fast Downward, the Scorpion planner, and the Adam optimiser, but does not provide version numbers for any software dependencies.
Experiment Setup | Yes | The same model parameters are used across our experiments: 4 message-passing layers and hidden dimension m = 64. For our framework, we use the output from the second-last hidden layer of the underlying regression model as the embedding vector, thus emb_s ∈ R^64. We use the Adam optimiser and an initial learning rate of 10^-3. The learning rate is reduced by a factor of 10 if the accuracy on the validation set does not improve for 10 consecutive epochs. Training is stopped when the learning rate reaches 10^-6 or after 500 epochs. (A hedged training-loop sketch follows the table.)
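
Notes on the pseudocode entry: Algorithm 1 in the paper is GBFS, here guided by learned pairwise rankings rather than a scalar heuristic. The following is a minimal Python sketch under that reading, not the paper's implementation; rank_better, successors, and is_goal are hypothetical stand-ins for the authors' interfaces.

    # Minimal sketch of greedy best-first search (GBFS) with the open list
    # ordered by a learned pairwise comparator instead of a scalar heuristic.
    # `rank_better`, `successors`, and `is_goal` are hypothetical placeholders.
    from functools import cmp_to_key

    def gbfs(initial_state, is_goal, successors, rank_better):
        """rank_better(a, b) -> negative if the learned model ranks state a
        ahead of state b (i.e., a should be expanded first)."""
        open_list = [initial_state]      # frontier of unexpanded states
        parents = {initial_state: None}  # parent pointers; doubles as seen set
        while open_list:
            # Pick the frontier state the pairwise ranker prefers most.
            s = min(open_list, key=cmp_to_key(rank_better))
            open_list.remove(s)
            if is_goal(s):
                # Reconstruct the plan by walking parent pointers.
                plan = []
                while s is not None:
                    plan.append(s)
                    s = parents[s]
                return list(reversed(plan))
            for t in successors(s):
                if t not in parents:     # skip already-seen states
                    parents[t] = s
                    open_list.append(t)
        return None                      # no plan found

The linear scan with min() keeps the sketch short; a real planner would use a priority queue keyed consistently with the comparator.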
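Notes on the splits and experiment-setup entries: the reported schedule maps onto a standard PyTorch pattern. A sketch under the paper's numbers (90%/10% split, Adam at 10^-3, LR divided by 10 after 10 epochs without validation-accuracy improvement, stop at LR 10^-6 or 500 epochs); model, problems, train_one_epoch, and validation_accuracy are hypothetical placeholders for the authors' GNN and training code.

    # Hedged sketch of the reported training setup in PyTorch; only the
    # numbers come from the paper. `model`, `problems`, `train_one_epoch`,
    # and `validation_accuracy` are hypothetical placeholders.
    import random
    import torch

    random.shuffle(problems)               # randomly split the training problems
    split = int(0.9 * len(problems))
    train_set, val_set = problems[:split], problems[split:]

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # Divide the learning rate by 10 when validation accuracy has not
    # improved for 10 consecutive epochs.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="max", factor=0.1, patience=10)

    for epoch in range(500):               # hard cap of 500 epochs
        train_one_epoch(model, optimizer, train_set)
        scheduler.step(validation_accuracy(model, val_set))
        if optimizer.param_groups[0]["lr"] <= 1.001e-6:
            break                          # LR floor of 1e-6 (float-safe check)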