Set-membership Belief State-based Reinforcement Learning for POMDPs
Authors: Wei Wei, Lijun Zhang, Lin Li, Huizhong Song, Jiye Liang
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate our method on several challenging control tasks in this section. Our experiments aim to answer the following questions: First, can the SBRL algorithm achieve good results in both partially observable and uncertain environments? Second, can SBM maintain accurate belief states to provide a reasonable basis for agents' decision-making in uncertain and partially observable environments? |
| Researcher Affiliation | Academia | 1School of Computer and Information Technology, Shanxi University, Taiyuan 030006, P.R. China. Correspondence to: Jiye Liang <liy@sxu.edu.cn>. |
| Pseudocode | Yes | Algorithm 1 SBRL algorithm |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | Mountain Hike is a continuous control environment with observation uncertainty where an agent navigates on a fixed 20 × 20 map, introduced by (Igl et al., 2018) to demonstrate the benefit of belief tracking for POMDP RL. |
| Dataset Splits | No | The paper uses standard benchmark environments but does not explicitly provide specific training/validation/test dataset splits with percentages or sample counts. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments. |
| Experiment Setup | No | We train SBRL and baselines with similar network architecture and hyperparameters as the original DPFRL implementation. |