Predictable MDP Abstraction for Unsupervised Model-Based RL
Authors: Seohong Park, Sergey Levine
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We theoretically analyze PMA and empirically demonstrate that PMA leads to significant improvements over prior unsupervised model-based RL approaches in a range of benchmark environments. |
| Researcher Affiliation | Academia | University of California, Berkeley. Correspondence to: Seohong Park <seohong@berkeley.edu>. |
| Pseudocode | Yes | We describe the full training procedure of PMA in Appendix F and Algorithm 1. |
| Open Source Code | Yes | Our code and videos are available at https://seohong.me/projects/pma/ |
| Open Datasets | Yes | We test PMA and the four previous methods on seven MuJoCo robotics environments (Todorov et al., 2012; Brockman et al., 2016) with 13 diverse tasks. |
| Dataset Splits | No | The paper specifies environment configurations and episode lengths but does not provide explicit dataset split percentages, sample counts, or methods for splitting data into training, validation, and test sets. |
| Hardware Specification | Yes | We run our experiments on an internal cluster consisting of A5000 or similar GPUs. |
| Software Dependencies | No | The paper mentions implementation on top of the 'LiSP (Lu et al., 2021) codebase' and uses 'Adam (Kingma & Ba, 2015)' and 'SAC (Haarnoja et al., 2018b)', but it does not provide specific version numbers for general software dependencies such as Python, PyTorch, or other libraries. |
| Experiment Setup | Yes | We present the hyperparameters used in our experiments in Tables 1 to 3. For example, Table 1 lists: # epochs 10000, # environment steps per epoch 4000, Minibatch size 256, Discount factor γ 0.995, Learning rate 3e-4, etc. (See the config sketch below this table.) |
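For quick reference, the Table 1 values quoted in the Experiment Setup row can be collected into a single configuration object. This is a minimal sketch: the class and field names (`PMAConfig`, `num_epochs`, `env_steps_per_epoch`, and so on) are illustrative assumptions, not identifiers from the authors' codebase; only the numeric values come from the paper.

```python
# Minimal sketch of the Table 1 hyperparameters quoted above.
# Names are illustrative assumptions; only the values come from the paper.
from dataclasses import dataclass


@dataclass(frozen=True)
class PMAConfig:
    num_epochs: int = 10_000          # "# epochs 10000"
    env_steps_per_epoch: int = 4_000  # "# environment steps per epoch 4000"
    minibatch_size: int = 256         # "Minibatch size 256"
    discount: float = 0.995           # "Discount factor γ 0.995"
    learning_rate: float = 3e-4       # "Learning rate 3e-4" (Adam)


config = PMAConfig()
# Implied total environment interaction: 10000 epochs x 4000 steps = 40M steps.
total_env_steps = config.num_epochs * config.env_steps_per_epoch
```

Pinning these values in a frozen dataclass keeps the run configuration immutable and easy to log alongside results, which is useful when verifying reproducibility claims like the ones in this table.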