Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

DyMoDreamer: World Modeling with Dynamic Modulation

Authors: Boxuan Zhang, Runqing Wang, Wei Xiao, Weipu Zhang, Jian Sun, Gao Huang, Jie Chen, Gang Wang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments demonstrate that DyMoDreamer sets a new state-of-the-art on the Atari 100k benchmark with a 156.6% mean human-normalized score, establishes a new record of 832 on the DeepMind Visual Control Suite, and gains a 9.5% performance improvement after 1M steps on the Crafter benchmark.
Researcher Affiliation	Academia	(1) School of Automation, Beijing Institute of Technology; (2) Department of Automation, BNRist, Tsinghua University
Pseudocode	No	The paper describes the methodology in Section 2 using prose and mathematical equations, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	Our code is released at https://github.com/Ultraman-Tiga1/DyMoDreamer.
Open Datasets	Yes	We evaluate DyMoDreamer on several widely-used benchmarks for sample-efficient RL: Atari 100k, DeepMind Visual Control Suite and Crafter. [...] Atari 100k [15] [...] DeepMind Visual Control Suite [64] [...] Crafter [42]
Dataset Splits	Yes	Agents are restricted to 100k actions (equivalent to 400k frames or 1.85 hours of gameplay) for training before evaluation.
Hardware Specification	Yes	In our experiments, we use a machine with an NVIDIA 4090 graphics card with 8 CPU cores and 24 GB RAM.
Software Dependencies	No	The paper mentions 'JAX implementation' but does not provide specific version numbers for JAX or any other software dependencies.
Experiment Setup	Yes	Table 7: Hyperparameters. This table details hyperparameters such as Batch size 16, Batch length 64, Learning rate 10^-4, Imagination horizon 15, and many others for both the World Model and Actor Critic.
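
Two figures quoted in the table rest on simple arithmetic: the Atari 100k budget (100k actions, 400k frames, 1.85 hours) and the human-normalized score. The sketch below reproduces the budget conversion and shows the commonly used normalization formula. The frame skip of 4 and the 60 frames-per-second rate are assumptions consistent with the quoted numbers but not stated in the table; the function names are hypothetical, and the formula is the standard definition rather than one quoted from the paper.

```python
# Assumed constants (not stated in the table): each action is repeated for
# 4 frames, and the emulator runs at 60 frames per second.
FRAME_SKIP = 4
FRAMES_PER_SECOND = 60


def actions_to_frames(actions: int, frame_skip: int = FRAME_SKIP) -> int:
    """Convert a budget of agent actions into environment frames."""
    return actions * frame_skip


def frames_to_hours(frames: int, fps: int = FRAMES_PER_SECOND) -> float:
    """Convert environment frames into hours of real-time gameplay."""
    return frames / fps / 3600


def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Standard normalization: 0.0 is random play, 1.0 is the human reference."""
    if human == random:
        raise ValueError("human and random scores must differ")
    return (agent - random) / (human - random)


frames = actions_to_frames(100_000)
hours = frames_to_hours(frames)
print(frames, round(hours, 2))  # 400000 1.85
```

A mean human-normalized score such as the 156.6% reported above would be the average of this per-game ratio across the benchmark's games, expressed as a percentage.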