Incremental Learning of Planning Actions in Model-Based Reinforcement Learning

Authors: Jun Hao Alvin Ng, Ronald P. A. Petrick

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our work with experimental results for three planning domains."
Researcher Affiliation | Academia | 1 Department of Computer Science, Heriot-Watt University; 2 School of Informatics, University of Edinburgh. alvin.ng@ed.ac.uk, R.Petrick@hw.ac.uk
Pseudocode | Yes | Algorithm 1: Incremental Learning Model
Open Source Code | No | The paper provides no explicit statement of, or link to, open-source code for the methodology.
Open Datasets | Yes | "We used three planning domains: the Tireworld and Exploding Blocksworld domains from the International Probabilistic Planning Competition [Younes et al., 2005], and the Logistics domain."
Dataset Splits | No | The paper does not specify a validation dataset or explicit train/validation/test splits.
Hardware Specification | Yes | "The machine used to run the experiments was a four core Intel(R) i5-6500 with 4 GB of RAM."
Software Dependencies | No | The paper mentions software such as PPDDL and Gourmand but does not give version numbers for these or any other dependencies.
Experiment Setup | Yes | "In one trial of experiments, ten planning problems are attempted sequentially in an order of increasing scale (see Table 1) following the idea behind curriculum learning [Bengio et al., 2009]. We denote each attempt as an episode. Each trial starts with no prior knowledge; the prior action models for episode 1 are empty action models. Since the planning problems are probabilistic, 50 independent trials are conducted."
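
To make the reported protocol concrete, the following is a minimal Python sketch of the experiment harness described in the Experiment Setup row: 50 independent trials, each attempting ten problems in order of increasing scale, with empty action models at episode 1 that are carried over between episodes. The helpers plan_and_execute and update_action_models are hypothetical placeholders; since no code is released, nothing here reflects the authors' actual implementation of Algorithm 1.

```python
import random

NUM_TRIALS = 50    # 50 independent trials, since the problems are probabilistic
NUM_EPISODES = 10  # ten planning problems per trial, in order of increasing scale

def plan_and_execute(problem_index, action_models):
    # Hypothetical placeholder: a real agent would plan with its current
    # learned action models, act in the (probabilistic) environment, and
    # return the observed transitions plus an outcome. Faked here so the
    # harness runs end to end.
    transitions = [("state", "action", "next_state")]
    success = random.random() < 0.5
    return transitions, success

def update_action_models(action_models, transitions):
    # Hypothetical placeholder for the paper's incremental learner
    # (Algorithm 1): revise the learned action models from new observations.
    updated = dict(action_models)
    updated["observations"] = updated.get("observations", 0) + len(transitions)
    return updated

success_counts = [0] * NUM_EPISODES
for trial in range(NUM_TRIALS):
    action_models = {}  # episode 1 starts from empty (no prior) action models
    for episode in range(NUM_EPISODES):  # curriculum: increasing problem scale
        transitions, success = plan_and_execute(episode, action_models)
        # Learned models persist into the next, larger problem.
        action_models = update_action_models(action_models, transitions)
        success_counts[episode] += int(success)

print("Successes per episode over all trials:", success_counts)
```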