Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels

Authors: Sai Rajeswar, Pietro Mazzaglia, Tim Verbelen, Alexandre Piché, Bart Dhoedt, Aaron Courville, Alexandre Lacoste

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The approach is evaluated through a large-scale empirical study used to validate the design choices and analyze the models: "An extensive empirical evaluation, supported by more than 2k experiments, among main results, analysis and ablations, was used to carefully study URLB and analyse our method."
Researcher Affiliation | Collaboration | *Equal contribution. 1 Mila, Université de Montréal; 2 ServiceNow Research; 3 Ghent University - imec, Belgium; 4 CIFAR Fellow.
Pseudocode | Yes | Algorithm 1 (Dyna-MPC) and Algorithm 2 (Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels).
Open Source Code | Yes | Project website: https://masteringurlb.github.io/. "Details on the implementation are provided in Appendix B and the code is available on the project website."
Open Datasets | Yes | "Recently, the Unsupervised RL Benchmark (URLB) (Laskin et al., 2021) established a common protocol to compare self-supervised algorithms across several domains and tasks from the DMC Suite (Tassa et al., 2018)."
Dataset Splits | No | The paper describes a pre-training (PT) phase of up to "2M frames" and a fine-tuning (FT) phase of "100k frames" as interaction budgets with the environment, but it does not provide explicit training/validation/test splits with percentages or sample counts, as would be typical for static datasets in supervised learning.
Hardware Specification | No | The paper does not specify the hardware used to run the experiments, such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper names algorithms and optimizers (e.g., "Dreamer V2", "Adam") and provides their hyperparameters, but does not list software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x) that would be needed for replication.
Experiment Setup | Yes | "The hyperparameters for the agent, which we keep fixed across all domains and tasks, can be found in Appendix I." Table 5 lists world model, actor-critic, planner (Dyna-MPC), and common hyperparameters.
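The two-phase interaction budget noted in the Dataset Splits row (reward-free pre-training for up to 2M frames, then task fine-tuning for 100k frames) can be sketched as below. This is a minimal illustration of frame-budget accounting, not the paper's implementation; the `ACTION_REPEAT` value and the `consume_budget` helper are assumptions introduced here for clarity.

```python
# Sketch of the URLB-style two-phase protocol: each phase is limited by an
# environment-frame budget, and each agent step consumes `action_repeat`
# frames. The budgets below come from the paper; the action repeat is assumed.

PT_BUDGET = 2_000_000   # reward-free pre-training budget ("2M frames")
FT_BUDGET = 100_000     # task fine-tuning budget ("100k frames")
ACTION_REPEAT = 2       # hypothetical value for illustration

def consume_budget(budget_frames: int, action_repeat: int = ACTION_REPEAT) -> int:
    """Return how many agent steps fit inside a frame budget."""
    steps = 0
    frames = 0
    while frames + action_repeat <= budget_frames:
        frames += action_repeat
        steps += 1
    return steps

pt_steps = consume_budget(PT_BUDGET)   # steps available for pre-training
ft_steps = consume_budget(FT_BUDGET)   # steps available for fine-tuning
print(pt_steps, ft_steps)
```

With an action repeat of 2, the 2M-frame pre-training budget corresponds to 1M agent steps and the 100k-frame fine-tuning budget to 50k steps, which is why frame budgets rather than dataset splits are the natural unit in this online-RL setting.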