Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

Graph Neural Network Based Action Ranking for Planning

Authors: Rajesh Mangannavar, Stefan Lee, Alan Fern, Prasad Tadepalli

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results across standard planning benchmarks demonstrate that our action-ranking approach not only achieves better generalization to larger problems than those used in training but also outperforms multiple baselines (value function and action ranking) methods in terms of success rate and plan quality.
Researcher Affiliation Academia Rajesh Mangannavar Oregon State University Corvallis, OR 97330, USA EMAIL
Pseudocode Yes Algorithm 1: Graph Attention-Based Action Ranking (GABAR) Procedure DECODER(g L, {v L i }i V, A, O, k) Algorithm 2: Beam Search Action Decoder
Open Source Code Yes Code : https://github.com/Learning-for-Seq-Decision-Making/GABAR-Graph-based-action-ranking-for-planning.
Open Datasets Yes All problems were generated using openly available PDDL-generators [21]. Dataset submitted as supplementary material
Dataset Splits Yes We divide the test set for each domain into 3 separate subsets, easy, medium, and hard with increasing difficulty with problem sizes as defined in table 1 along with the training and validation dataset sizes. Each test subset has 100 problems. In contrast, the training dataset consists of problems smaller and simpler than the ones found in the easy subset.
Hardware Specification Yes It takes between 1-2 hours to train a model for each domain on an RTX 3080.
Software Dependencies No The paper mentions 'Adam optimizer' and 'GNN' but does not specify version numbers for these or any other software libraries or programming languages used.
Experiment Setup Yes For all domains, we train the model using the Adam optimizer with a learning rate of 0.0005, 9 rounds of GNN, and batch size of 16, hidden dimensionality of 64. Training proceeds for a maximum of 500 epochs, and we select the model checkpoint that achieves the lowest loss on the validation set for evaluation.