Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Understanding Sample Generation Strategies for Learning Heuristic Functions in Classical Planning

Authors: Rafael V. Bettker, Pedro P. Minini, André G. Pereira, Marcus Ritt

JAIR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Through controlled experiments on planning tasks with small state spaces, we identify several techniques that improve the quality of the samples used for training. The contributions include: A systematic study on sampling quality (Section 4).
Researcher Affiliation Academia Rafael V. Bettker EMAIL Pedro P. Minini EMAIL Andre G. Pereira EMAIL Marcus Ritt EMAIL Universidade Federal do Rio Grande do Sul, Brazil
Pseudocode No The paper describes various algorithms and strategies, such as the FSM sampling strategy and improvement procedures (SAI, SUI), but it does so through descriptive text and flowcharts (like Figure 1: Training set generation workflow), rather than explicitly formatted pseudocode or algorithm blocks with numbered steps.
Open Source Code Yes The source code, planning tasks, and experiments are available (footnote 3: https://github.com/yaaig-ufrgs/NeuralFastDownward-FSM).
Open Datasets Yes We use the benchmark defined by Ferber et al. (2020) and Ferber et al. (2022). Ferber et al. (2020) select, for each domain, IPC planning tasks with their original initial states that are solved between 1 and 900 seconds by GBFS with hFF. Each domain has the following number of selected tasks: Blocks World, 5; Depot, 6; Grid, 2; N-Puzzle, 8; Pipesworld-NoTankage, 10; Rovers, 8; Scanalyzer, 6; Storage, 4; Transport, 8; VisitAll, 6.
Dataset Splits Yes We use 90 % of the sampled data as the training set, with the remaining 10 % as the validation set.
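The 90%/10% train/validation split quoted above can be sketched as follows. The shuffling step, the seed, and the `split_samples` name are illustrative assumptions, not details taken from the paper:

```python
import random


def split_samples(samples, train_frac=0.9, seed=0):
    """Split sampled states into training and validation sets.

    Shuffles a copy of the samples (assumed; the paper only states the
    90%/10% proportions), then takes the first train_frac as training
    data and the remainder as validation data.
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


train_set, val_set = split_samples(list(range(1000)))
```

With 1000 samples this yields 900 training and 100 validation examples.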
Hardware Specification Yes All experiments were run on a PC with an AMD Ryzen 9 3900X 12-core processor running at 4.2 GHz with 32 GB of main memory, using a single core per process, with processes distributed among 12 cores (for small planning tasks) or 10 cores (for large planning tasks).
Software Dependencies Yes All methods are implemented on the Neural Fast Downward planning system with PyTorch 1.9.0 (Ferber et al. 2020; Paszke et al. 2019).
Experiment Setup Yes The network has two hidden layers followed by a residual block with two hidden layers. Each hidden layer has 250 neurons that use ReLU activation and are initialized as proposed by He et al. (2015). The output of the NN uses ReLU activation during training and evaluation. The training uses the Adam optimizer (Kingma and Ba 2015), a learning rate of 10⁻⁴, an early-stop patience of 100, and a mean squared error loss function. Due to better results in preliminary experiments, we use batch sizes of 64 for small and 512 for large state spaces.
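The architecture and optimizer settings quoted above can be sketched in PyTorch (the paper's stated framework). The exact wiring of the skip connection inside the residual block and the names `ResidualBlock`, `HeuristicNet`, and `make_trainer` are assumptions for illustration, not the authors' implementation:

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Residual block with two hidden layers (skip placement assumed)."""

    def __init__(self, width):
        super().__init__()
        self.fc1 = nn.Linear(width, width)
        self.fc2 = nn.Linear(width, width)

    def forward(self, x):
        h = torch.relu(self.fc1(x))
        h = self.fc2(h)
        return torch.relu(x + h)  # skip connection around the two layers


class HeuristicNet(nn.Module):
    """Two hidden layers, then a residual block, then a ReLU-activated output."""

    def __init__(self, input_dim, width=250):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, width)
        self.fc2 = nn.Linear(width, width)
        self.res = ResidualBlock(width)
        self.out = nn.Linear(width, 1)
        # He (Kaiming) initialization, as proposed by He et al. (2015).
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.kaiming_uniform_(m.weight, nonlinearity="relu")
                nn.init.zeros_(m.bias)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.res(x)
        return torch.relu(self.out(x))  # ReLU output: non-negative h-values


def make_trainer(net):
    """Adam with learning rate 1e-4 and MSE loss, as stated in the paper."""
    opt = torch.optim.Adam(net.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()
    return opt, loss_fn


torch.manual_seed(0)
net = HeuristicNet(input_dim=16)
pred = net(torch.zeros(4, 16))
```

The ReLU on the output layer guarantees non-negative heuristic estimates, which matches its use during both training and evaluation.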