Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
M³PC: Test-time Model Predictive Control using Pretrained Masked Trajectory Model
Authors: Kehan Wen, Yutong Hu, Yao Mu, Lei Ke
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on D4RL and RoboMimic show that our inference-phase MPC significantly improves the decision-making performance of a pretrained trajectory model without any additional parameter training. |
| Researcher Affiliation | Academia | ETH Zurich, KU Leuven, Hong Kong University, Carnegie Mellon University |
| Pseudocode | Yes | Algorithm 1: Forward M³PC for Reward Maximization |
| Open Source Code | Yes | Code is available: https://github.com/wkh923/m3pc. |
| Open Datasets | Yes | To answer these questions, we utilize the D4RL and RoboMimic dataset suites. |
| Dataset Splits | No | The main text provides no explicit train/test/validation splits (percentages or counts) for the D4RL and RoboMimic datasets. The paper uses these established benchmark suites but does not state how they were partitioned for its experiments (e.g., an "80/10/10 split"). |
| Hardware Specification | Yes | The entire training process, including both pretraining and finetuning, is performed on NVIDIA 3090 GPUs. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Table 4 in Section B 'HYPERPARAMETERS' explicitly lists numerous hyperparameters for both offline and online training, including batch size, learning rate, weight decay, target entropy, scheduler type, warmup steps, training steps, and architecture-specific parameters like number of encoder/decoder layers, heads, and embedding dimension. |
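The paper's central claim quoted above is that inference-phase MPC improves a pretrained trajectory model without further training. As background for readers unfamiliar with this pattern, the sketch below shows a generic random-shooting MPC loop: it is a hypothetical illustration, not the authors' Forward M³PC algorithm, and `predict_return` is a stand-in for whatever return estimate the pretrained model supplies.

```python
import numpy as np

def mpc_plan(predict_return, state, action_dim, horizon=5, n_candidates=64, rng=None):
    """Generic random-shooting MPC sketch (not the paper's algorithm).

    Samples candidate action sequences, scores each with a pretrained
    model's predicted return, and executes only the first action of the
    best sequence (receding horizon).
    """
    rng = rng or np.random.default_rng(0)
    # Candidate action sequences, uniformly sampled in a normalized action space.
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, action_dim))
    # Score each sequence with the frozen, pretrained model -- no training here.
    scores = np.array([predict_return(state, seq) for seq in candidates])
    best = candidates[np.argmax(scores)]
    return best[0]  # only the first action is applied; planning repeats next step

# Toy stand-in "model": prefers action sequences with small magnitudes.
toy_model = lambda state, seq: -np.sum(seq ** 2)
action = mpc_plan(toy_model, state=np.zeros(3), action_dim=2)
```

The receding-horizon structure is what lets a purely frozen model steer behavior at test time: all improvement comes from search over candidates, matching the "no additional parameter training" claim.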