Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Dyn-O: Building Structured World Models with Object-Centric Representations

Authors: Zizhao Wang, Kaixin Wang, Li Zhao, Peter Stone, Jiang Bian

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate Dyn-O in seven Procgen [7] environments. Our experiments indicate that Dyn-O learns high-quality object-centric representations, generalizable world models, and disentangled static and dynamic representations. We summarize our contributions as follows.
Researcher Affiliation	Collaboration	Zizhao Wang (1,2), Kaixin Wang (2), Li Zhao (2), Peter Stone (1,3), Jiang Bian (2). Affiliations: (1) The University of Texas at Austin, (2) Microsoft Research Asia, (3) Sony AI.
Pseudocode	Yes	Algorithm 1 Dyn-O World Model Learning
Open Source Code	Yes	The code of Dyn-O can be found at: https://github.com/wangzizhao/dyn-O.
Open Datasets	Yes	Our experiments are conducted in Procgen [7], a set of procedurally-generated 2D video game environments. In addition to Procgen, we further compare Dyn-O against Dreamer V3 on the CLEVR dataset [24] and two ALE environments [4], and the results are in Appendix B.3.
Dataset Splits	No	In each Procgen environment, a PPG policy [8] is trained and used to collect an offline dataset of 1M transitions from the first 200 levels for the learning of all methods. To evaluate the generalizability of each method, we use the learned world model to generate 20-step rollouts in 500 unseen levels.
Hardware Specification	No	The paper does not explicitly mention specific hardware models (like GPU types, CPU types, or memory) used for running the experiments.
Software Dependencies	No	The paper mentions a "PPG policy [8]" and the Adam optimizer, but does not provide version numbers for any software libraries, programming languages, or environments used in the implementation.
Experiment Setup	Yes	Table 4: The Architecture and Hyperparameters of Encoder Learning. Table 5: The Architecture and Hyperparameters of Dynamics Learning.
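
The Dataset Splits row describes a level-based protocol: training data comes from the first 200 levels, and generalization is measured with 20-step rollouts in 500 unseen levels. The sketch below illustrates one way such a split could be expressed. The names `level_split`, `TRAIN_LEVELS`, `EVAL_LEVELS`, and `ROLLOUT_HORIZON` are hypothetical, and the choice of level IDs 200 to 699 for the unseen levels is an assumption; the paper excerpt does not state which IDs were used.

```python
# Illustrative sketch only; not taken from the Dyn-O code base.
# The counts come from the Dataset Splits row. The ID ranges are assumed.

TRAIN_LEVELS = 200      # "the first 200 levels"
EVAL_LEVELS = 500       # "500 unseen levels"
ROLLOUT_HORIZON = 20    # "20-step rollouts"


def level_split(num_train=TRAIN_LEVELS, num_eval=EVAL_LEVELS):
    """Return disjoint ranges of level IDs for training and evaluation.

    Assumption: evaluation levels are the IDs immediately following the
    training levels. Any set of IDs outside the training range would
    satisfy "unseen" equally well.
    """
    train_ids = range(0, num_train)
    eval_ids = range(num_train, num_train + num_eval)
    return train_ids, eval_ids


train_ids, eval_ids = level_split()

# No evaluation level may appear in the training set.
assert set(train_ids).isdisjoint(eval_ids)
```

Keeping the two ranges disjoint is the property that makes the evaluation a test of generalization rather than memorization of the training levels.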