Learning to Rank for Synthesizing Planning Heuristics

Authors: Caelan Reed Garrett, Leslie Pack Kaelbling, Tomás Lozano-Pérez

IJCAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Our experiments on recent International Planning Competition problems show that the RankSVM learned heuristics outperform both the original heuristics and heuristics learned through ordinary regression." |
| Researcher Affiliation | Academia | "Caelan Reed Garrett, Leslie Pack Kaelbling, Tomás Lozano-Pérez, MIT CSAIL, Cambridge, MA 02139, USA, {caelan, lpk, tlp}@csail.mit.edu" |
| Pseudocode | No | The information is insufficient: the paper describes its methods in narrative text and mathematical formulations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The information is insufficient: the paper mentions building on existing open-source frameworks and libraries (Fast Downward, dlib) but does not state that the authors' own code for the described methodology is publicly available, nor does it provide a link to it. |
| Open Datasets | Yes | "We experimented on four domains from the 2014 IPC learning track [Vallati et al., 2015]: elevators, transport, parking, and no-mystery." |
| Dataset Splits | Yes | "We select an appropriate value of λ by performing domain-wise leave-one-out cross validation (LOOCV): for different possible values of λ, and in a domain with n training problems, we train on data from n − 1 training problems and evaluate the resulting heuristic on the remaining problem according to the RMSE loss function, and average the scores from holding out each problem instance." |
| Hardware Specification | Yes | "Each planner was run on a single 2.5 GHz processor for 30 minutes with 5 GB of memory." |
| Software Dependencies | No | The information is insufficient: the paper mentions using "the Fast Downward framework [Helmert, 2006]" and "the dlib C++ machine learning library [King, 2009]" but does not give version numbers for these dependencies. |
| Experiment Setup | Yes | "We select an appropriate value of λ by performing domain-wise leave-one-out cross validation (LOOCV): for different possible values of λ, and in a domain with n training problems, we train on data from n − 1 training problems and evaluate the resulting heuristic on the remaining problem according to the RMSE loss function, and average the scores from holding out each problem instance." |
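
The Dataset Splits and Experiment Setup rows both quote the paper's procedure for choosing the regularization weight λ: domain-wise leave-one-out cross validation scored by RMSE. Below is a minimal sketch of that selection loop. Note that `train_heuristic` and `rmse` are hypothetical stand-ins for the paper's actual training (a RankSVM fit via dlib) and evaluation code, which the authors did not release.

```python
import math

def loocv_select_lambda(problems, lambdas, train_heuristic, rmse):
    """Domain-wise leave-one-out cross validation for lambda.

    For each candidate lambda, hold out each of the domain's n training
    problems in turn, train a heuristic on the remaining n - 1 problems,
    score the held-out problem with RMSE, and average the scores.
    Returns the lambda with the lowest average RMSE.

    `problems` is a list of per-problem training datasets for one domain;
    `train_heuristic(data, lam)` and `rmse(heuristic, problem)` are
    hypothetical helpers, not part of the paper's published artifacts.
    """
    best_lam, best_score = None, math.inf
    for lam in lambdas:
        scores = []
        for i, held_out in enumerate(problems):
            # Train on the n - 1 problems other than the held-out one.
            train_data = [p for j, p in enumerate(problems) if j != i]
            heuristic = train_heuristic(train_data, lam)
            scores.append(rmse(heuristic, held_out))
        avg = sum(scores) / len(scores)
        if avg < best_score:
            best_lam, best_score = lam, avg
    return best_lam
```

This mirrors the quoted description directly: the outer loop ranges over candidate λ values, the inner loop implements the hold-one-problem-out splits, and the average RMSE across folds is the model-selection criterion.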