Provable Representation with Efficient Planning for Partially Observable Reinforcement Learning
Authors: Hongming Zhang, Tongzheng Ren, Chenjun Xiao, Dale Schuurmans, Bo Dai
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a comprehensive empirical comparison with existing RL algorithms for POMDPs on several benchmarks, demonstrating the superior empirical performance of µLV-Rep (Section 7). |
| Researcher Affiliation | Collaboration | 1 University of Alberta 2 UT Austin 3 The Chinese University of Hong Kong, Shenzhen 4 Google DeepMind 5 Georgia Tech. |
| Pseudocode | Yes | Algorithm 1 Online Exploration for L-step decodable POMDPs with Latent Variable Representation |
| Open Source Code | No | The paper does not explicitly state that its source code is made open source or provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | We evaluate the proposed method on Meta-world (Yu et al., 2019), which is an open-source simulated benchmark consisting of 50 distinct robotic manipulation tasks with visual observations. We also provide experiment results on partially observable control problems constructed based on OpenAI Gym MuJoCo (Todorov et al., 2012) in Appendix H.2. |
| Dataset Splits | No | The paper mentions evaluating on Meta-world and MuJoCo tasks but does not specify train/validation/test splits, only referring to 'standard split' or general evaluation protocols. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using SAC (Haarnoja et al., 2018), DreamerV2 (Hafner et al., 2021), MWM (Seo et al., 2023), and VAE (Kingma & Welling, 2013), but it does not specify version numbers for any of these or other software dependencies. |
| Experiment Setup | Yes | More implementation details, including network architectures and hyper-parameters, are provided in Appendix H. Table 2 lists the hyperparameters in µLV-Rep; the numbers in the Conv and MLP entries denote output channels and units. |