reproducibilityindex.ai

MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations

Authors: Nicklas Hansen, Yixin Lin, Hao Su, Xiaolong Wang, Vikash Kumar, Aravind Rajeswaran

ICLR 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We empirically study three complex visuo-motor control domains and find that our method is 160%-250% more successful in completing sparse reward tasks compared to prior approaches in the low data regime (100k interaction steps, 5 demonstrations).
Researcher Affiliation	Collaboration	(1) Meta AI; (2) University of California San Diego. {nihansen,haosu,xiw012}@ucsd.edu, {vikashplus,aravraj}@meta.com
Pseudocode	Yes	Algorithm 1 Model-Based Reinforcement Learning with Demonstrations (MoDem)
Open Source Code	Yes	Code and videos are available at https://nicklashansen.github.io/modemrl. ... We provide extensive implementation details in appendices, and have made our full implementation available at https://github.com/facebookresearch/modem.
Open Datasets	Yes	Experiments are conducted with publicly available environments. ... We evaluate methods extensively across three domains: Adroit (Rajeswaran et al., 2018), Meta-World (Yu et al., 2019), and DMControl (Tassa et al., 2018).
Dataset Splits	No	The paper mentions evaluating methods under a budget of '100k online interactions' and using '5 demonstrations', which refers to the overall interaction budget for learning and evaluation, but it does not specify explicit dataset splits (e.g., percentages or counts for training, validation, and testing sets) in the traditional sense of static datasets. The experiments are conducted in an online reinforcement learning setting where data is collected interactively rather than from a pre-defined static split.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies	No	The paper mentions a 'PyTorch-like overview of our architecture' and 'Adam' as an optimizer, but it does not specify exact version numbers for software dependencies like Python, PyTorch, or other libraries.
Experiment Setup	Yes	Table 5. MoDem hyperparameters. We list all relevant hyperparameters for our proposed method below. Highlighted rows are unique to MoDem, whereas the remainder are inherited from TD-MPC but included for completeness.