Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Structured World Belief for Reinforcement Learning in POMDP
Authors: Gautam Singh, Skand Peri, Junghyun Kim, Hyunseok Kim, Sungjin Ahn
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, we show that object-centric belief provides a more accurate and robust performance for filtering and generation. Furthermore, we show the efficacy of structured world belief in improving the performance of reinforcement learning, planning and supervised reasoning. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, Rutgers University; ²Electronics and Telecommunications Research Institute; ³Rutgers Center for Cognitive Science. |
| Pseudocode | Yes | Algorithm 1 File-Slot Matching and Glimpse Proposal |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of its methodology. |
| Open Datasets | Yes | 2D Maze Game. MattChanTK. gym-maze. https://github.com/MattChanTK/gym-maze, 2020. |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly describe validation dataset splits. For example, it states 'we trained our model on 2D Branching Sprites with upto 2 objects but we test in a setting in which up to 4 objects can spawn', indicating a train/test split, but no specific validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | The paper mentions various models and algorithms like AESMC, A2C, and SPACE, but does not provide specific version numbers for any software libraries or dependencies used in the implementation. |
| Experiment Setup | Yes | We evaluate our model SWB with K = 10 particles which provides both object-centric representation and belief states. [...] The SWB world models for 2D Branching Sprites, 3D Food Chase game and 2D Maze Game were pre-trained using frames collected through 200K, 200K and 500K interactions with the environment, respectively, using a random policy. |