Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
M³PC: Test-time Model Predictive Control using Pretrained Masked Trajectory Model
Authors: Kehan Wen, Yutong Hu, Yao Mu, Lei Ke
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on D4RL and RoboMimic show that our inference-phase MPC significantly improves the decision-making performance of a pretrained trajectory model without any additional parameter training. |
| Researcher Affiliation | Academia | ETH Zurich, KU Leuven, Hong Kong University, Carnegie Mellon University |
| Pseudocode | Yes | Algorithm 1: Forward M³PC for Reward Maximization |
| Open Source Code | Yes | Code is available: https://github.com/wkh923/m3pc. |
| Open Datasets | Yes | To answer these questions, we utilize the D4RL and RoboMimic dataset suites. |
| Dataset Splits | No | The main text provides no explicit train/test/validation splits (percentages or counts) for the D4RL and RoboMimic datasets. The paper uses these established benchmark suites but does not state how they were partitioned for its experiments (e.g., an "80/10/10 split"). |
| Hardware Specification | Yes | The entire training process, including both pretraining and finetuning, is performed on NVIDIA 3090 GPUs. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Table 4 in Section B 'HYPERPARAMETERS' explicitly lists numerous hyperparameters for both offline and online training, including batch size, learning rate, weight decay, target entropy, scheduler type, warmup steps, training steps, and architecture-specific parameters like number of encoder/decoder layers, heads, and embedding dimension. |
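The paper's central claim quoted above is that inference-phase MPC improves a pretrained trajectory model without further training. As background for readers unfamiliar with this pattern, the sketch below shows a generic random-shooting MPC loop: it is a hypothetical illustration, not the authors' Forward M³PC algorithm, and `predict_return` is a stand-in for whatever return estimate the pretrained model supplies.

```python
import numpy as np

def mpc_plan(predict_return, state, action_dim, horizon=5, n_candidates=64, rng=None):
    """Generic random-shooting MPC sketch (not the paper's algorithm).

    Samples candidate action sequences, scores each with a pretrained
    model's predicted return, and executes only the first action of the
    best sequence (receding horizon).
    """
    rng = rng or np.random.default_rng(0)
    # Candidate action sequences, uniformly sampled in a normalized action space.
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, action_dim))
    # Score each sequence with the frozen, pretrained model -- no training here.
    scores = np.array([predict_return(state, seq) for seq in candidates])
    best = candidates[np.argmax(scores)]
    return best[0]  # only the first action is applied; planning repeats next step

# Toy stand-in "model": prefers action sequences with small magnitudes.
toy_model = lambda state, seq: -np.sum(seq ** 2)
action = mpc_plan(toy_model, state=np.zeros(3), action_dim=2)
```

The receding-horizon structure is what lets a purely frozen model steer behavior at test time: all improvement comes from search over candidates, matching the "no additional parameter training" claim.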