On Using Admissible Bounds for Learning Forward Search Heuristics
Authors: Carlos Núñez-Molina, Masataro Asai, Pablo Mesejo, Juan Fernandez-Olivares
IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments where both MSE and our novel loss function are applied to learning a heuristic from optimal plan costs. Results show that our proposed method converges faster during training and yields better heuristics. From Section 4 (Experimental Evaluation): We evaluate the effectiveness of our new loss function under the domain-specific generalization setting, where the learned heuristic function is required to generalize across different problems of a single domain. |
| Researcher Affiliation | Collaboration | Carlos Núñez-Molina¹, Masataro Asai², Pablo Mesejo¹ and Juan Fernández-Olivares¹. ¹University of Granada, Spain; ²MIT-IBM Watson AI Lab, USA |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Our full code and data can be found in github.com/pddl-heuristic-learning/pddlsl. |
| Open Datasets | Yes | We trained our system on four classical planning domains: blocksworld-4ops, ferry, gripper, and visitall. Using PDDL domains as benchmarks for evaluating planning performance is a standard practice, as exemplified by the International Planning Competitions (IPCs) [Vallati et al., 2015]. |
| Dataset Splits | Yes | For each domain, we generated three sets of problem instances (train, validation, test) with parameterized generators used in the IPCs. We provided between 456 and 1536 instances for training (the variation is due to the difference in the number of generator parameters in each domain), between 132 and 384 instances for validation and testing (as separate sets), and 100 instances sampled from the test set for planning. |
| Hardware Specification | Yes | On a single NVIDIA Tesla V100, each NLM training took 0.5 hrs except in visitall (≈2 hrs). |
| Software Dependencies | No | The paper mentions 'Pyperplan' as the basis for the planning component but does not provide a specific version number for it or any other software dependencies. |
| Experiment Setup | Yes | We performed 4 × 10^4 weight updates (training steps) using AdamW [Loshchilov and Hutter, 2017] with batch size 256, weight decay 10^-2 to avoid overfitting, gradient clip 0.1, learning rate of 10^-2 for the linear regression and NLM, and 10^-3 for HGN. |
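
To make the procedure quoted in the Dataset Splits row concrete, here is a minimal Python sketch of splitting generated instances into train/validation/test sets and sampling 100 planning instances from the test set. The directory layout, the 70/15/15 proportions, and the random seed are illustrative assumptions; the paper only reports the resulting per-domain counts (456-1536 train, 132-384 validation and test) and the 100 test instances used for planning.

```python
import random
from pathlib import Path

random.seed(0)

# Hypothetical directory of instances produced by an IPC generator for one domain.
instances = sorted(Path("instances/blocksworld-4ops").glob("*.pddl"))
random.shuffle(instances)

# Illustrative 70/15/15 split; the paper gives absolute counts, not proportions.
n = len(instances)
train = instances[: int(0.70 * n)]
val = instances[int(0.70 * n) : int(0.85 * n)]
test = instances[int(0.85 * n) :]

# 100 instances sampled from the test set are used for the planning experiments.
planning = random.sample(test, min(100, len(test)))
```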
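The Experiment Setup row lists the optimizer hyperparameters; the following PyTorch sketch shows one way they could be wired up. The toy model, feature dimension, and `sample_batch` helper are placeholders invented for illustration (the paper's NLM/HGN architectures and data pipeline are not reproduced), and norm-based gradient clipping is an assumption.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Placeholder model standing in for the paper's NLM; the real architecture differs.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 1))

# Hyperparameters quoted in the Experiment Setup row (lr 1e-2 is the NLM /
# linear-regression setting; HGN uses 1e-3).
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2, weight_decay=1e-2)
loss_fn = nn.MSELoss()      # baseline loss; the paper also proposes its own loss
NUM_STEPS = 4 * 10**4       # 4 x 10^4 weight updates
BATCH_SIZE = 256
GRAD_CLIP = 0.1             # gradient clip (norm clipping assumed)

def sample_batch(batch_size):
    # Hypothetical stand-in for loading (state features, optimal plan cost) pairs.
    return torch.randn(batch_size, 64), torch.randn(batch_size)

for step in range(NUM_STEPS):
    x, y = sample_batch(BATCH_SIZE)
    optimizer.zero_grad()
    loss = loss_fn(model(x).squeeze(-1), y)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), GRAD_CLIP)
    optimizer.step()
```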