Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP
Authors: Jiacheng Guo, Zihao Li, Huazheng Wang, Mengdi Wang, Zhuoran Yang, Xuezhou Zhang
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the performance of PORL2 using the partially observed combination lock (pocomblock) as our benchmark, which is inspired by the combination lock benchmark introduced by Misra et al. (2019). ... In our experiment, we compare our method with BRIEE, the latest representation learning algorithm for MDP. ... Figure 2 is the moving average of evaluation returns of pocomblock for PORL2 and BRIEE |
| Researcher Affiliation | Academia | 1Department of Electrical and Computer Engineering, Princeton University, Princeton, NJ, USA 2School of Mathematical Sciences, Fudan University, Shanghai, China 3School of Electrical Engineering and Computer Science, Oregon State University, OR, USA 4Department of Statistics and Data Science, Yale University, NH, USA. |
| Pseudocode | Yes | Algorithm 1 Partially Observable Representation Learning for L-decodable POMDPs (PORL2-decodable) |
| Open Source Code | Yes | Reproducibility. Our model and code can be found at https://github.com/icmlpomdpexpe/POMDPreplearn. |
| Open Datasets | Yes | We evaluate the performance of PORL2 using the partially observed combination lock (pocomblock) as our benchmark, which is inspired by the combination lock benchmark introduced by Misra et al. (2019). |
| Dataset Splits | No | No explicit statement of training, validation, or test dataset splits (e.g., percentages or counts) was found. The paper describes the custom 'pocomblock' environment used for evaluation and lists hyperparameters in tables, but not dataset splits. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments were provided in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) were explicitly stated. The paper mentions using a 'two-layer neural network' and an 'SGD' optimizer. |
| Experiment Setup | Yes | We record the hyperparameters we try and the final hyperparameter we use for PORL2 in Table 2 and BRIEE in Table 3. These tables provide specific values for Batch size, Discriminator f number of gradient steps, Horizon, The number of iterations of representation learning, LSVI-LLR bonus coefficient β, LSVI-LLR regularization coefficient λ, Optimizer, Decoder ϕ learning rate, Discriminator f learning rate. |
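For readers assembling a reproduction, the hyperparameter fields reported in the row above can be gathered into a single configuration object. The sketch below mirrors the field names from the paper's Table 2; the default values are illustrative placeholders only (the actual values used are in Tables 2 and 3 of the paper), and the `PORL2Config` name is my own:

```python
from dataclasses import dataclass


@dataclass
class PORL2Config:
    """Hyperparameters named in the paper's Table 2 for PORL2.

    All default values below are placeholders, NOT the paper's values.
    """
    batch_size: int = 512
    discriminator_grad_steps: int = 100   # discriminator f: number of gradient steps
    horizon: int = 10                     # episode horizon H
    rep_learning_iters: int = 20          # iterations of representation learning
    lsvi_llr_bonus_beta: float = 1.0      # LSVI-LLR bonus coefficient beta
    lsvi_llr_reg_lambda: float = 1.0      # LSVI-LLR regularization coefficient lambda
    optimizer: str = "SGD"                # the paper reports SGD as the optimizer
    decoder_phi_lr: float = 1e-3          # decoder phi learning rate
    discriminator_f_lr: float = 1e-3      # discriminator f learning rate


config = PORL2Config()
print(config.optimizer)
```

Collecting the fields this way makes it easy to log the full configuration alongside each run, which is exactly the information the paper's hyperparameter tables record.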