Blockwise Sequential Model Learning for Partially Observable Reinforcement Learning

Authors: Giseung Park, Sungho Choi, Youngchul Sung

AAAI 2022, pp. 7941-7948

Reproducibility assessment: each variable below is listed with its result and the supporting LLM response.
Research Type: Experimental
LLM Response: In this section, we provide some numerical results to evaluate the proposed block model learning scheme for POMDPs. Numerical results show that the proposed method significantly outperforms previous methods in various partially observable environments.
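For context on the method under evaluation: the paper's core idea is to split each trajectory into blocks and update the latent summary once per block rather than at every timestep. The sketch below is a minimal, hypothetical illustration of that blockwise factorization using GRU modules; the block length, encoder choice, and all names here are assumptions for illustration, not the paper's actual architecture (which is specified in its Appendix C and D).

```python
import torch
import torch.nn as nn

class BlockwiseEncoder(nn.Module):
    """Summarize a trajectory block by block: one latent update per block.

    Hypothetical sketch; the paper's model may parameterize the latent
    variables and encoders differently (see its Appendix C/D).
    """

    def __init__(self, obs_dim: int, latent_dim: int, block_len: int):
        super().__init__()
        self.block_len = block_len
        self.within_block = nn.GRU(obs_dim, latent_dim, batch_first=True)
        self.across_blocks = nn.GRUCell(latent_dim, latent_dim)  # carries the summary between blocks

    def forward(self, obs_seq: torch.Tensor) -> torch.Tensor:
        """obs_seq: (batch, T, obs_dim) -> per-block latents (batch, T // block_len, latent_dim)."""
        batch, T, _ = obs_seq.shape
        latent = obs_seq.new_zeros(batch, self.within_block.hidden_size)
        block_latents = []
        for start in range(0, T - self.block_len + 1, self.block_len):
            block = obs_seq[:, start:start + self.block_len]   # observations in one block
            _, h = self.within_block(block)                    # encode within the block
            latent = self.across_blocks(h.squeeze(0), latent)  # single latent update per block
            block_latents.append(latent)
        return torch.stack(block_latents, dim=1)

enc = BlockwiseEncoder(obs_dim=8, latent_dim=32, block_len=4)
z = enc(torch.randn(2, 16, 8))  # -> (2, 4, 32): 16 timesteps summarized as 4 block latents
```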
Researcher Affiliation: Academia
LLM Response: Giseung Park, Sungho Choi, Youngchul Sung; School of Electrical Engineering, KAIST, Korea; {gs.park, sungho.choi, ycsung}@kaist.ac.kr
Pseudocode: Yes
LLM Response: The pseudocode of the algorithm and the details are described in Appendix C and D, respectively.
Open Source Code: Yes
LLM Response: Our source code is provided at https://github.com/Giseung-Park/BlockSeq.
Open Datasets: Yes
LLM Response lists four partially observable environments (an instantiation sketch follows this list):
- Mountain Hike (Igl et al. 2018)
- Some part of each state is missing: Pendulum, random-missing version (Meng, Gorbet, and Kulic 2021)
- Memorizing long history is required: Sequential target-reaching task (Han, Doya, and Tani 2020a)
- Navigating agent cannot observe the whole map in a maze: Minigrid (https://github.com/maximecb/gym-minigrid)
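As a concrete starting point for reproduction, the sketch below instantiates two of the listed environments with the older gym / gym-minigrid APIs and adds a hypothetical wrapper that zeroes random observation entries, in the spirit of the random-missing Pendulum variant. The environment IDs, the masking probability, and the wrapper itself are assumptions; the paper's exact configurations are given in its Appendix D.

```python
import gym
import gym_minigrid  # noqa: F401 -- importing registers the MiniGrid-* environments
import numpy as np

class RandomMissingObs(gym.ObservationWrapper):
    """Hypothetical wrapper: zero out each observation entry with probability p,
    mimicking a 'random missing' partial-observability setting."""

    def __init__(self, env, p: float = 0.1):
        super().__init__(env)
        self.p = p

    def observation(self, obs):
        mask = np.random.rand(*np.shape(obs)) >= self.p  # keep each entry with prob 1 - p
        return np.asarray(obs) * mask

env = RandomMissingObs(gym.make("Pendulum-v0"), p=0.1)  # version suffix depends on your gym release
maze = gym.make("MiniGrid-Empty-8x8-v0")                # agent observes only a local egocentric view
obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())
```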
Dataset Splits: No
LLM Response: No explicit training/validation/test dataset splits (e.g., percentages or counts) were provided. The paper discusses training and evaluation in terms of episodes and timesteps in RL environments, where data is generated through interaction.
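Since there are no fixed splits, training data in this setting is generated by rolling out the current policy; a minimal, hypothetical collection loop under the old four-tuple gym step API might look like this:

```python
def collect_episode(env, policy, max_steps=1000):
    """Roll out one episode: in RL the 'dataset' is produced by interaction,
    so there is nothing to split into train/val/test up front."""
    trajectory = []
    obs, done, t = env.reset(), False, 0
    while not done and t < max_steps:
        action = policy(obs)
        next_obs, reward, done, _ = env.step(action)
        trajectory.append((obs, action, reward, next_obs, done))
        obs, t = next_obs, t + 1
    return trajectory

# e.g., a random-policy rollout on the masked Pendulum env built above:
episode = collect_episode(env, policy=lambda obs: env.action_space.sample())
```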
Hardware Specification: No
LLM Response: No specific hardware details (e.g., GPU/CPU models, memory) used for experiments were mentioned.

Software Dependencies: No
LLM Response: No specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) were mentioned.

Experiment Setup: Yes
LLM Response: The details of the implementations are described in Appendix D.