TD-MPC2: Scalable, Robust World Models for Continuous Control
Authors: Nicklas Hansen, Hao Su, Xiaolong Wang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate TD-MPC2 across a total of 104 diverse continuous control tasks spanning 4 task domains: DMControl (Tassa et al., 2018), Meta-World (Yu et al., 2019), ManiSkill2 (Gu et al., 2023), and MyoSuite (Caggiano et al., 2022). We summarize our results in Figure 1, and visualize task domains in Figure 2. ... Our results demonstrate that TD-MPC2 consistently outperforms existing model-based and model-free methods, using the same hyperparameters across all tasks (Figure 1, right). |
| Researcher Affiliation | Academia | University of California San Diego, Equal advising {nihansen,haosu,xiw012}@ucsd.edu |
| Pseudocode | No | The paper describes the algorithm in prose and provides architectural details, but it does not include a formal pseudocode block or algorithm listing. |
| Open Source Code | Yes | In support of open-source science, we publicly release 300+ model checkpoints, datasets, and code for training and evaluating TD-MPC2 agents, which is available at https://tdmpc2.com. |
| Open Datasets | Yes | We evaluate TD-MPC2 across a total of 104 diverse continuous control tasks spanning 4 task domains: DMControl (Tassa et al., 2018), Meta-World (Yu et al., 2019), ManiSkill2 (Gu et al., 2023), and MyoSuite (Caggiano et al., 2022). |
| Dataset Splits | No | The paper does not explicitly describe train/validation/test dataset splits with specific percentages or sample counts for the environments or collected data. It mentions 'validation' in the context of Q-function ensemble, but not for overall dataset partitioning. |
| Hardware Specification | Yes | Approximate TD-MPC2 training cost on the 80-task dataset, reported in GPU days on a single NVIDIA GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch-like notation' for the architecture and refers to libraries like 'LayerNorm (Ba et al., 2016)' and 'Mish (Misra, 2019)', but it does not specify exact version numbers for any software dependencies. |
| Experiment Setup | Yes | We use the same hyperparameters across all tasks. Our hyperparameters are listed in Table 8. (Table 8 details Planning Horizon, Iterations, Batch size, Learning rate, etc.) |
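The "same hyperparameters across all tasks" claim in the last row can be sketched as a single shared configuration object reused for every task. The sketch below is illustrative only: the field names mirror the categories listed for Table 8 (planning horizon, iterations, batch size, learning rate), but the values are placeholders, not the actual numbers from the paper.

```python
# Hypothetical sketch of a single hyperparameter set shared by all 104 tasks.
# Field values are placeholders, NOT the actual values from Table 8.
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Hyperparams:
    planning_horizon: int = 3     # placeholder
    iterations: int = 6           # placeholder
    batch_size: int = 256         # placeholder
    learning_rate: float = 3e-4   # placeholder


def make_agent_config(task: str, hp: Hyperparams = Hyperparams()) -> dict:
    """Every task receives the identical hyperparameter set; only the
    task identifier differs between configurations."""
    return {"task": task, **asdict(hp)}


# Two tasks from different domains share every setting except the task name:
cfg_a = make_agent_config("dmcontrol-walker-walk")
cfg_b = make_agent_config("metaworld-assembly")
shared = {k: v for k, v in cfg_a.items() if k != "task"}
assert shared == {k: v for k, v in cfg_b.items() if k != "task"}
```

This mirrors the reported setup in spirit: per-task tuning is replaced by one fixed configuration, which is what makes the cross-domain comparison in Figure 1 meaningful.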