On Using Admissible Bounds for Learning Forward Search Heuristics

Authors: Carlos Núñez-Molina, Masataro Asai, Pablo Mesejo, Juan Fernandez-Olivares

IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct experiments where both MSE and our novel loss function are applied to learning a heuristic from optimal plan costs. Results show that our proposed method converges faster during training and yields better heuristics." and, from Section 4 (Experimental Evaluation): "We evaluate the effectiveness of our new loss function under the domain-specific generalization setting, where the learned heuristic function is required to generalize across different problems of a single domain." (A hedged sketch of the two losses is given after this table.)
Researcher Affiliation | Collaboration | Carlos Núñez-Molina (1), Masataro Asai (2), Pablo Mesejo (1), and Juan Fernández-Olivares (1); 1: University of Granada, Spain; 2: MIT-IBM Watson AI Lab, USA
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | Our full code and data can be found in github.com/pddl-heuristic-learning/pddlsl.
Open Datasets | Yes | We trained our system on four classical planning domains: blocksworld-4ops, ferry, gripper, and visitall. Using PDDL domains as benchmarks for evaluating planning performance is a standard practice, as exemplified by the International Planning Competitions (IPCs) [Vallati et al., 2015].
Dataset Splits | Yes | For each domain, we generated three sets of problem instances (train, validation, test) with parameterized generators used in the IPCs. We provided between 456 and 1536 instances for training (the variation is due to the difference in the number of generator parameters in each domain), between 132 and 384 instances for validation and testing (as separate sets), and 100 instances sampled from the test set for planning. (A sketch of such a split is given after this table.)
Hardware Specification | Yes | On a single NVIDIA Tesla V100, each NLM training took 0.5 hrs, except in visitall (2 hrs).
Software Dependencies | No | The paper mentions 'Pyperplan' as the basis for the planning component but does not provide a specific version number for it or any other software dependencies.
Experiment Setup | Yes | We performed 4 × 10^4 weight updates (training steps) using AdamW [Loshchilov and Hutter, 2017] with batch size 256, weight decay 10^-2 to avoid overfitting, gradient clip 0.1, learning rate of 10^-2 for the linear regression and NLM, and 10^-3 for HGN. (A sketch of this optimizer configuration is given after this table.)
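
Loss functions (Research Type row): the paper compares plain MSE regression on optimal plan costs against a loss that exploits an admissible lower bound on those costs. Below is a minimal PyTorch sketch of the two alternatives; the truncated-Gaussian negative log-likelihood shown here is one way such a bound can enter the loss, and the function and variable names are illustrative rather than taken from the released pddlsl code.

    # Minimal sketch: MSE vs. a lower-bound-aware loss. Assumes the model
    # predicts a mean (and, for the second loss, a log-std) for the optimal
    # plan cost, and that lower_bound comes from an admissible heuristic.
    import math
    import torch

    def mse_loss(pred_mean, target):
        # Plain squared-error regression on the optimal plan cost.
        return torch.mean((pred_mean - target) ** 2)

    def truncated_gaussian_nll(pred_mean, pred_log_std, lower_bound, target):
        # NLL of a Gaussian truncated below at the admissible bound:
        # p(x) = N(x; mu, sigma) / (1 - Phi((l - mu) / sigma)) for x >= l.
        sigma = torch.exp(pred_log_std)
        z = (target - pred_mean) / sigma
        alpha = (lower_bound - pred_mean) / sigma
        log_tail = torch.special.log_ndtr(-alpha)  # log(1 - Phi(alpha))
        nll = 0.5 * z**2 + pred_log_std + 0.5 * math.log(2 * math.pi) + log_tail
        return torch.mean(nll)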
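
Dataset splits (Dataset Splits row): a minimal sketch of how generated PDDL instances might be partitioned into the three sets and how the 100 planning instances could be sampled from the test set. The directory layout, split ratios, and function names are assumptions for illustration; the released code may organize this differently.

    import random
    from pathlib import Path

    def split_instances(instance_dir, seed=0):
        # Partition generated PDDL instances into train/validation/test and
        # sample up to 100 test instances for the planning experiments.
        instances = sorted(Path(instance_dir).glob("*.pddl"))
        random.Random(seed).shuffle(instances)
        n = len(instances)
        n_train, n_val = int(0.7 * n), int(0.15 * n)  # illustrative ratios
        train = instances[:n_train]
        val = instances[n_train:n_train + n_val]
        test = instances[n_train + n_val:]
        planning = random.Random(seed + 1).sample(test, min(100, len(test)))
        return train, val, test, planning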
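
Experiment setup (Experiment Setup row): a minimal PyTorch sketch of the reported optimizer configuration. The tiny linear model and random batches are placeholders standing in for the NLM/HGN heuristic models and the planning data, and since the paper does not state whether the 0.1 gradient clip is norm- or value-based, norm clipping is assumed here.

    import torch

    model = torch.nn.Linear(8, 1)      # placeholder for NLM / HGN / linear model
    optimizer = torch.optim.AdamW(
        model.parameters(),
        lr=1e-2,                       # 1e-3 was used for the HGN
        weight_decay=1e-2,             # "to avoid overfitting"
    )

    for step in range(4 * 10**4):      # 4 x 10^4 weight updates
        features = torch.randn(256, 8) # batch size 256 (dummy data)
        targets = torch.randn(256, 1)
        loss = torch.mean((model(features) - targets) ** 2)
        optimizer.zero_grad()
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.1)
        optimizer.step()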