Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Understanding Sample Generation Strategies for Learning Heuristic Functions in Classical Planning

Authors: Rafael V. Bettker, Pedro P. Minini, André G. Pereira, Marcus Ritt

JAIR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Through controlled experiments on planning tasks with small state spaces, we identify several techniques that improve the quality of the samples used for training. The contributions include: A systematic study on sampling quality (Section 4).
Researcher Affiliation Academia Rafael V. Bettker EMAIL Pedro P. Minini EMAIL Andre G. Pereira EMAIL Marcus Ritt EMAIL Universidade Federal do Rio Grande do Sul, Brazil
Pseudocode No The paper describes various algorithms and strategies, such as the FSM sampling strategy and improvement procedures (SAI, SUI), but it does so through descriptive text and flowcharts (like Figure 1: Training set generation workflow), rather than explicitly formatted pseudocode or algorithm blocks with numbered steps.
Open Source Code Yes The source code, planning tasks, and experiments are available (footnote 3: https://github.com/yaaig-ufrgs/NeuralFastDownward-FSM).
Open Datasets Yes We use the benchmark defined by Ferber et al. (2020) and Ferber et al. (2022). Ferber et al. (2020) select, for each domain, IPC planning tasks with their original initial states that are solved between 1 and 900 seconds by GBFS with hFF. Each domain has the following number of selected tasks: Blocks World, 5; Depot, 6; Grid, 2; N-Puzzle, 8; Pipesworld-NoTankage, 10; Rovers, 8; Scanalyzer, 6; Storage, 4; Transport, 8; VisitAll, 6.
Dataset Splits Yes We use 90 % of the sampled data as the training set, with the remaining 10 % as the validation set.
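The 90%/10% train/validation split quoted above can be sketched as follows. The shuffling step, the seed, and the `split_samples` name are illustrative assumptions, not details taken from the paper:

```python
import random


def split_samples(samples, train_frac=0.9, seed=0):
    """Split sampled states into training and validation sets.

    Shuffles a copy of the samples (assumed; the paper only states the
    90%/10% proportions), then takes the first train_frac as training
    data and the remainder as validation data.
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


train_set, val_set = split_samples(list(range(1000)))
```

With 1000 samples this yields 900 training and 100 validation examples.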
Hardware Specification Yes All experiments were run on a PC with an AMD Ryzen 9 3900X 12-core processor running at 4.2 GHz with 32 GB of main memory, using a single core per process, with processes distributed among 12 cores (for small planning tasks) or 10 cores (for large planning tasks).
Software Dependencies Yes All methods are implemented on the Neural Fast Downward planning system with PyTorch 1.9.0 (Ferber et al. 2020; Paszke et al. 2019).
Experiment Setup Yes The network has two hidden layers followed by a residual block with two hidden layers. Each hidden layer has 250 neurons that use ReLU activation and are initialized as proposed by He et al. (2015). The output of the NN uses ReLU activation during training and evaluation. The training uses the Adam optimizer (Kingma and Ba 2015), a learning rate of 10⁻⁴, an early-stop patience of 100, and a mean squared error loss function. Due to better results in preliminary experiments, we use batch sizes of 64 for small and 512 for large state spaces.
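The architecture and optimizer settings quoted above can be sketched in PyTorch (the paper's stated framework). The exact wiring of the skip connection inside the residual block and the names `ResidualBlock`, `HeuristicNet`, and `make_trainer` are assumptions for illustration, not the authors' implementation:

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Residual block with two hidden layers (skip placement assumed)."""

    def __init__(self, width):
        super().__init__()
        self.fc1 = nn.Linear(width, width)
        self.fc2 = nn.Linear(width, width)

    def forward(self, x):
        h = torch.relu(self.fc1(x))
        h = self.fc2(h)
        return torch.relu(x + h)  # skip connection around the two layers


class HeuristicNet(nn.Module):
    """Two hidden layers, then a residual block, then a ReLU-activated output."""

    def __init__(self, input_dim, width=250):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, width)
        self.fc2 = nn.Linear(width, width)
        self.res = ResidualBlock(width)
        self.out = nn.Linear(width, 1)
        # He (Kaiming) initialization, as proposed by He et al. (2015).
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.kaiming_uniform_(m.weight, nonlinearity="relu")
                nn.init.zeros_(m.bias)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.res(x)
        return torch.relu(self.out(x))  # ReLU output: non-negative h-values


def make_trainer(net):
    """Adam with learning rate 1e-4 and MSE loss, as stated in the paper."""
    opt = torch.optim.Adam(net.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()
    return opt, loss_fn


torch.manual_seed(0)
net = HeuristicNet(input_dim=16)
pred = net(torch.zeros(4, 16))
```

The ReLU on the output layer guarantees non-negative heuristic estimates, which matches its use during both training and evaluation.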