Blockwise Sequential Model Learning for Partially Observable Reinforcement Learning
Authors: Giseung Park, Sungho Choi, Youngchul Sung
AAAI 2022, pp. 7941-7948
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we provide some numerical results to evaluate the proposed block model learning scheme for POMDPs. Numerical results show that the proposed method significantly outperforms previous methods in various partially observable environments. |
| Researcher Affiliation | Academia | Giseung Park, Sungho Choi, Youngchul Sung School of Electrical Engineering, KAIST, Korea {gs.park, sungho.choi, ycsung}@kaist.ac.kr |
| Pseudocode | Yes | The pseudocode of the algorithm and the details are described in Appendix C and D, respectively. |
| Open Source Code | Yes | Our source code is provided at https://github.com/Giseung-Park/BlockSeq. |
| Open Datasets | Yes | Four partially observable benchmark environments are used: Mountain Hike (Igl et al. 2018); a Pendulum variant in which part of each state is randomly missing (Meng, Gorbet, and Kulic 2021); the sequential target-reaching task, which requires memorizing a long history (Han, Doya, and Tani 2020a); and Minigrid mazes in which the navigating agent cannot observe the whole map (https://github.com/maximecb/gym-minigrid). A minimal Minigrid setup sketch follows the table. |
| Dataset Splits | No | No explicit training/validation/test dataset splits (e.g., percentages or counts) were provided. The paper discusses training and evaluation in terms of episodes and timesteps in RL environments, where data is generated through interaction. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for experiments were mentioned. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) were mentioned. |
| Experiment Setup | Yes | (The details of the implementations are described in Appendix D.) |
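As the rows above indicate, the paper's central idea is to learn a sequential model over blocks of timesteps rather than updating a recurrent state at every step. The sketch below illustrates only that general blockwise idea under stated assumptions: the class name `BlockEncoder`, the GRU summarizer, and the `block_len` parameter are all hypothetical, and the authors' actual model and pseudocode are given in the paper's Appendices C and D.

```python
import torch
import torch.nn as nn

class BlockEncoder(nn.Module):
    """Hypothetical sketch: summarize a trajectory block-by-block.

    The observation sequence is split into fixed-length blocks and
    each block is compressed into a single latent vector (here, the
    final GRU hidden state). This mirrors only the general blockwise
    idea, not the paper's exact architecture.
    """

    def __init__(self, obs_dim: int, latent_dim: int, block_len: int):
        super().__init__()
        self.block_len = block_len
        self.gru = nn.GRU(obs_dim, latent_dim, batch_first=True)

    def forward(self, obs_seq: torch.Tensor) -> torch.Tensor:
        # obs_seq: (T, obs_dim); zero-pad so T is a multiple of block_len.
        T, obs_dim = obs_seq.shape
        pad = (-T) % self.block_len
        if pad:
            obs_seq = torch.cat([obs_seq, obs_seq.new_zeros(pad, obs_dim)])
        # Reshape into (num_blocks, block_len, obs_dim) and encode each
        # block independently; h[-1] holds one latent per block.
        blocks = obs_seq.view(-1, self.block_len, obs_dim)
        _, h = self.gru(blocks)
        return h[-1]  # (num_blocks, latent_dim)

encoder = BlockEncoder(obs_dim=4, latent_dim=32, block_len=8)
latents = encoder(torch.randn(50, 4))  # -> (7, 32) after padding to 56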
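The Minigrid environments cited in the Open Datasets row are partially observable out of the box: the agent receives only a small egocentric patch of the grid rather than the full map. Below is a minimal setup sketch using the gym-minigrid package linked above; the specific environment id `MiniGrid-Empty-8x8-v0` is illustrative and may differ from the tasks the paper actually evaluates (see its Appendix D).

```python
import gym
import gym_minigrid  # registers the MiniGrid-* environments
from gym_minigrid.wrappers import ImgObsWrapper

# MiniGrid observations are partial by default: the agent sees only a
# small egocentric view of the grid, not the whole maze.
env = ImgObsWrapper(gym.make('MiniGrid-Empty-8x8-v0'))
obs = env.reset()
print(obs.shape)  # e.g. (7, 7, 3): the agent's limited field of view
```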