Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Action Space Reduction for Planning Domains
Authors: Harsha Kokel, Junkyu Lee, Michael Katz, Kavitha Srinivas, Shirin Sohrabi
IJCAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show a significant reduction in the action label space size across a wide collection of planning domains. We demonstrate the benefit of our automated label reduction in two separate use cases: improved sample complexity of model-free reinforcement learning algorithms and speeding up successor generation in lifted planning. |
| Researcher Affiliation | Industry | Harsha Kokel, Junkyu Lee, Michael Katz, Kavitha Srinivas and Shirin Sohrabi, IBM T.J. Watson Research Center, Yorktown Heights, USA. EMAIL, EMAIL |
| Pseudocode | Yes | Input: A lifted action o with parameters params(o) and a set of relevant lifted mutex groups L. Find: A subset X ⊆ params(o) of parameters s.t. there exist X₁, …, Xₖ with (i) X = X₁ ⊆ X₂ ⊆ … ⊆ Xₖ = params(o), and (ii) Xᵢ₊₁ = Xᵢ ∪ vc(l) for some l ∈ L s.t. vf(l) ⊆ Xᵢ. ... To solve the parameter seed set problem, we cast it as a (delete-free) STRIPS planning task with operation costs. We first find a set L of relevant LMGs. Then, for each lifted action o we define a separate planning task Π_o = ⟨L_o, O_o, I_o, G_o⟩, where language L_o contains a single predicate mark and an object for each parameter in params(o). The set O_o consists of two types of actions: 1. seed_x actions are defined for each parameter x ∈ params(o) as seed_x := ⟨seed_x, log(\|D(x)\|), ∅, {mark(x)}⟩; 2. get_l actions are defined for each relevant LMG l as get_l := ⟨get_l, 0, {mark(x) \| x ∈ vf(l)}, {mark(y) \| y ∈ vc(l)}⟩. Initial state I_o = ∅; goal state G_o = {mark(x) \| x ∈ params(o)}. |
| Open Source Code | Yes | The code and supplementary material are available at https://github.com/IBM/Parameter-Seed-Set. |
| Open Datasets | Yes | We compare the size of label sets, obtained with and without the proposed reduction, on a representative set of 14 STRIPS domains from various IPC (using the typed versions where available) and 10 hard-to-ground (HTG) domains. ... We generate 500 unique pairs of initial and goal states in each domain. Of these, 250 pairs were used in training and the remaining were set aside for evaluation. |
| Dataset Splits | No | The paper mentions training and evaluation/test sets but does not explicitly describe a validation split, so the exact data partitioning cannot be fully reproduced. |
| Hardware Specification | No | The paper does not explicitly describe specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. It mentions using "Fast Downward" and "ACME RL library" but no hardware specifications. |
| Software Dependencies | No | The paper mentions software like "Fast Downward [Helmert, 2006]", "implementation by Fišer [2020]", and "ACME RL library [Hoffman et al., 2020]" but does not provide specific version numbers for these software components, which are necessary for reproducible software dependencies. |
| Experiment Setup | No | The paper describes some aspects of the experimental setup, such as using 500 unique initial and goal state pairs for RL experiments, and using h_FF as a dense reward function with Double DQN from ACME. However, it lacks specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed system-level training configurations needed for full reproducibility. |
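The parameter-seed-set compilation quoted in the Pseudocode row can be illustrated in code. The sketch below is not the paper's implementation (that lives in the linked IBM/Parameter-Seed-Set repository); it is a minimal stand-in that solves the same delete-free task with uniform-cost search instead of Fast Downward. The function name, input encoding (LMGs as `(vf, vc)` frozenset pairs), and return convention are all assumptions made for this illustration. Seeding a parameter x costs log|D(x)|; a get_l "action" marks vc(l) for free once vf(l) is marked, matching the action definitions above.

```python
import heapq
import itertools
import math

def parameter_seed_set(params, domain_sizes, lmgs):
    """Illustrative solver for the parameter seed set problem.

    params       -- list of parameter names of a lifted action o
    domain_sizes -- dict mapping parameter x to its domain size |D(x)|
    lmgs         -- list of (vf, vc) frozenset pairs: the fixed and
                    counted variables of each relevant lifted mutex group
    Returns (cost, seeds): a cheapest seed set X from which every
    parameter in params(o) becomes marked via the zero-cost get_l steps.
    """
    goal = frozenset(params)
    start = frozenset()          # initial state I_o = empty set
    best = {start: 0.0}
    tie = itertools.count()      # tiebreaker so the heap never compares sets
    queue = [(0.0, next(tie), tuple(), start)]
    while queue:
        cost, _, seeds, marked = heapq.heappop(queue)
        if marked == goal:       # goal G_o: every parameter is marked
            return cost, set(seeds)
        if cost > best.get(marked, math.inf):
            continue             # stale queue entry
        # seed_x: mark x unconditionally at cost log|D(x)|.
        for x in params:
            if x not in marked:
                nxt = marked | {x}
                c = cost + math.log(domain_sizes[x])
                if c < best.get(nxt, math.inf):
                    best[nxt] = c
                    heapq.heappush(queue, (c, next(tie), seeds + (x,), nxt))
        # get_l: if vf(l) is already marked, mark vc(l) at zero cost.
        for vf, vc in lmgs:
            if vf <= marked and not vc <= marked:
                nxt = marked | vc
                if cost < best.get(nxt, math.inf):
                    best[nxt] = cost
                    heapq.heappush(queue, (cost, next(tie), seeds, nxt))
    return math.inf, set()

# Toy example: one LMG lets y be inferred once x is marked, so seeding
# only x (cost log 2) beats seeding y directly (cost log 4).
cost, seeds = parameter_seed_set(
    ["x", "y"], {"x": 2, "y": 4},
    [(frozenset({"x"}), frozenset({"y"}))],
)
```

Exhaustive search over marked-parameter subsets is exponential in |params(o)|, which is why the paper compiles the problem to a delete-free STRIPS task and hands it to a planner; this sketch is only meant to make the action semantics concrete.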