Incremental Learning of Planning Actions in Model-Based Reinforcement Learning
Authors: Jun Hao Alvin Ng, Ronald P. A. Petrick
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our work with experimental results for three planning domains. |
| Researcher Affiliation | Academia | ¹ Department of Computer Science, Heriot-Watt University; ² School of Informatics, University of Edinburgh. alvin.ng@ed.ac.uk, R.Petrick@hw.ac.uk |
| Pseudocode | Yes | Algorithm 1: Incremental Learning Model |
| Open Source Code | No | The paper provides no explicit statement of, or link to, open-source code for the methodology. |
| Open Datasets | Yes | We used three planning domains: the Tireworld and Exploding Blocksworld domains from the International Probabilistic Planning Competition [Younes et al., 2005], and the Logistics domain. |
| Dataset Splits | No | The paper does not specify details about a validation dataset or explicit train/validation/test splits. |
| Hardware Specification | Yes | The machine used to run the experiments was a four core Intel(R) i5-6500 with 4 GB of RAM. |
| Software Dependencies | No | The paper mentions the PPDDL modeling language and the Gourmand planner but does not provide version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | In one trial of experiments, ten planning problems are attempted sequentially in an order of increasing scale (see Table 1) following the idea behind curriculum learning [Bengio et al., 2009]. We denote each attempt as an episode. Each trial starts with no prior knowledge; the prior action models for episode 1 are empty action models. Since the planning problems are probabilistic, 50 independent trials are conducted. |
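For concreteness, the quoted experiment setup can be read as a simple driver loop: 50 independent trials, each attempting 10 planning problems in order of increasing scale, with the action models reset to empty at the start of every trial and updated incrementally across episodes. The sketch below is a hypothetical illustration of that protocol, not the authors' released code (none is available); `empty_action_models`, `attempt_episode`, and the placeholder problem names are stand-ins.

```python
NUM_TRIALS = 50    # independent trials, since the planning problems are probabilistic
NUM_EPISODES = 10  # planning problems attempted sequentially per trial

# Placeholder problems, ordered by increasing scale (curriculum learning).
problems = [f"problem-{i + 1}" for i in range(NUM_EPISODES)]

def empty_action_models():
    """Each trial starts with no prior knowledge: empty action models."""
    return {}

def attempt_episode(problem, action_models):
    """Stand-in for one episode: plan with the current models, execute,
    observe transitions, and incrementally update the action models."""
    # ... planning, execution, and model learning would happen here ...
    return action_models

results = []
for trial in range(NUM_TRIALS):
    action_models = empty_action_models()  # reset prior knowledge per trial
    for problem in problems:               # one episode per problem, easiest first
        action_models = attempt_episode(problem, action_models)
    results.append(action_models)
```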