Making Linear MDPs Practical via Contrastive Representation Learning

Authors: Tianjun Zhang, Tongzheng Ren, Mengjiao Yang, Joseph Gonzalez, Dale Schuurmans, Bo Dai

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Empirically, we demonstrate superior performance over existing state-of-the-art model-based and model-free algorithms on several benchmarks." |
| Researcher Affiliation | Collaboration | ¹UC Berkeley, ²UT Austin, ³Google Brain, ⁴University of Alberta |
| Pseudocode | Yes | "Algorithm 1 CTRL-UCB: Online Exploration with Representation Learning" (hedged sketches of the contrastive objective and the UCB bonus follow the table) |
| Open Source Code | No | The paper does not provide an explicit statement of code release or a link to a repository for the methodology described. |
| Open Datasets | Yes | "We test our algorithm extensively on the dense-reward MuJoCo benchmark from MBBL. ... We conduct experiments on the DeepMind Control Suite. ... Lastly, we instantiate our CTRL-LCB algorithm in the offline setting on the D4RL benchmark (Fu et al., 2020)." (a loading example follows the table) |
| Dataset Splits | No | The paper discusses data collection and benchmarks but does not explicitly provide training/validation/test dataset splits with specific percentages or counts. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for the experiments. |
| Software Dependencies | No | The paper refers to existing software frameworks and components (e.g., an "actor-critic algorithm with entropy regularizer") but does not list specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | "In this section, we list all the hyperparameters and network architectures we use for our experiments. For online MuJoCo and DM Control tasks, the hyperparameters can be found in Table 5. ... Table 5. Hyperparameters used for CTRL-UCB in all the environments in MuJoCo and DM Control Suite." |
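
Since no code is released, the paper's central technique can only be sketched. A linear MDP factorizes transitions as P(s'|s,a) ≈ ⟨φ(s,a), μ(s')⟩, and the paper learns φ and μ with a contrastive objective. Below is a minimal PyTorch sketch assuming an InfoNCE-style loss with in-batch negatives; the network widths, the noise distribution, and the exact loss variant in the paper may differ, and all names here (`PhiNet`, `MuNet`, `contrastive_loss`) are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PhiNet(nn.Module):
    """Maps a state-action pair to a d-dimensional feature phi(s, a)."""
    def __init__(self, s_dim, a_dim, d):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(s_dim + a_dim, 256), nn.ReLU(),
            nn.Linear(256, d),
        )

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

class MuNet(nn.Module):
    """Maps a next state to a d-dimensional feature mu(s')."""
    def __init__(self, s_dim, d):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(s_dim, 256), nn.ReLU(),
            nn.Linear(256, d),
        )

    def forward(self, s_next):
        return self.net(s_next)

def contrastive_loss(phi, mu, s, a, s_next):
    """InfoNCE-style loss with in-batch negatives (positives on the diagonal).

    Pushes <phi(s,a), mu(s')> up for observed transitions and down for
    mismatched pairs, approximating P(s'|s,a) ~ <phi(s,a), mu(s')>.
    """
    f = phi(s, a)        # (B, d)
    g = mu(s_next)       # (B, d)
    logits = f @ g.t()   # (B, B): row i scores every s_next[j] against (s,a)[i]
    labels = torch.arange(s.shape[0], device=s.device)
    return F.cross_entropy(logits, labels)
```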
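Algorithm 1 (CTRL-UCB) then plans on top of the learned features with an optimism bonus. The standard choice for linear MDPs is the elliptical bonus β·sqrt(φᵀΛ⁻¹φ), with Λ the regularized feature covariance; the sketch below assumes that form, and CTRL-LCB would subtract the same quantity as a pessimism penalty in the offline setting. β and λ are hyperparameters of the kind the paper reports in Table 5.

```python
import torch

def inverse_covariance(phi_batch, lam=1.0):
    """Lambda^{-1} for Lambda = lam * I + sum_i phi_i phi_i^T over stored features."""
    d = phi_batch.shape[1]
    cov = lam * torch.eye(d) + phi_batch.t() @ phi_batch
    return torch.linalg.inv(cov)

def elliptical_bonus(phi_batch, cov_inv, beta=1.0):
    """Per-sample bonus beta * sqrt(phi^T Lambda^{-1} phi)."""
    quad = torch.einsum('bi,ij,bj->b', phi_batch, cov_inv, phi_batch)
    return beta * quad.clamp(min=0.0).sqrt()

# Usage sketch: add the bonus to rewards for UCB (online exploration),
# subtract it for LCB (offline pessimism).
# features = phi(states, actions)  # (B, d) from the learned PhiNet
# r_ucb = rewards + elliptical_bonus(features, cov_inv, beta)
```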
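For the offline CTRL-LCB experiments, D4RL exposes its datasets directly through gym environments. A minimal loading example follows; `halfcheetah-medium-v2` is an illustrative choice, since the exact dataset versions used by the paper are not restated in this table.

```python
import gym
import d4rl  # importing d4rl registers the offline environments with gym

# Load an offline dataset of (s, a, r, s') transitions.
env = gym.make('halfcheetah-medium-v2')
dataset = d4rl.qlearning_dataset(env)

print(dataset['observations'].shape)  # (N, obs_dim)
print(dataset['actions'].shape)       # (N, act_dim)
```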