Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Curious Causality-Seeking Agents in Open-ended Worlds

Authors: Zhiyu Zhao, Haoxuan Li, Haifeng Zhang, Jun Wang, Francesco Faccio, Jürgen Schmidhuber, Mengyue Yang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on both synthetic tasks and a challenging robot arm manipulation task demonstrate that our method robustly captures shifts in causal dynamics and generalizes effectively to previously unseen contexts. Empirically, our approach demonstrates substantial improvements over purely observational baselines in simulated benchmarks, showcasing its capability to accurately recover complex causal structures. 6 Experiments
Researcher Affiliation	Academia	1 University of Bristol; 2 Peking University; 3 Institute of Automation, Chinese Academy of Sciences; 4 School of Artificial Intelligence, Chinese Academy of Sciences; 5 University College London; 6 King Abdullah University of Science and Technology; 7 The Swiss AI Lab, IDSIA-USI/SUPSI. Corresponding author: EMAIL
Pseudocode	Yes	Algorithm 1 Curious Causality-Seeking Meta Causal World Modeling
Open Source Code	Yes	The code is available at https://github.com/zhiyu-zhao-ucas/Meta-Causal-Graph.git, and demonstrations can be found at https://sites.google.com/view/meta-causal-world.
Open Datasets	Yes	We use the Chemical environment to evaluate the performance of the proposed method on learning the causal graphs in a system with multiple causal structures. There are several causal graphs (full, fork, chain) in the Chemical environment and the causal graph depends on the state of the objects. We use two settings of the Chemical environment: (1) full-fork (Fork) and (2) full-chain (Chain). Magnetic [31] The Magnetic environment is a robot arm manipulation task built on the Robosuite framework (Figure 4).
Dataset Splits	No	Training step: 1.5 × 10^5, 1.5 × 10^5, 2 × 10^5. Max episode length: 25, 25, 25. During the test, some nodes are corrupted with noise.
Hardware Specification	Yes	Most experiments were conducted on a server equipped with an AMD EPYC 7V13 64-Core Processor (24 physical cores), supporting 32-bit and 64-bit modes, with 96 MiB L3 cache and 12 MiB L2 cache. The machine was equipped with an NVIDIA A100 PCIe GPU with 80GB memory (driver version 575.51.03, CUDA version 12.9).
Software Dependencies	Yes	The machine was equipped with an NVIDIA A100 PCIe GPU with 80GB memory (driver version 575.51.03, CUDA version 12.9).
Experiment Setup	Yes	The detailed environment configurations are shown in Table 7. Parameters (Chemical Fork / Chemical Chain / Magnetic): Training step 1.5 × 10^5 / 1.5 × 10^5 / 2 × 10^5; Optimizer Adam / Adam / Adam; Learning rate 1e-4 / 1e-4 / 1e-4; Batch size 256 / 256 / 256; Initial step 1000 / 1000 / 1500; Max episode length 25 / 25 / 25; Action type Discrete / Discrete / Continuous. The detailed hyper-parameters are shown in Table 6. CEM parameters (Chemical full-fork / Chemical full-chain / Magnetic): Planning length 3 / 3 / 1; Number of candidates 64 / 64 / 64; Number of top candidates 32 / 32 / 32; Number of iterations 5 / 5 / 5; Exploration noise N/A / N/A / 1e-4; Exploration probability 0.05 / 0.05 / N/A; Action type Discrete / Discrete / Continuous.
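
To illustrate how the CEM settings quoted in the Experiment Setup row fit together, here is a minimal sketch of a generic cross-entropy method planner using the Magnetic column values (planning length 1, 64 candidates, 32 top candidates, 5 iterations, exploration noise 1e-4). It is not the authors' implementation: the scoring function `toy_return`, the Gaussian sampling distribution, the initial mean and standard deviation, and the reading of exploration noise as the standard deviation of noise added to the chosen action are all assumptions made for illustration.

```python
import random
import statistics

# Values transcribed from the CEM table quoted above (Magnetic column).
PLANNING_LENGTH = 1
NUM_CANDIDATES = 64
NUM_TOP_CANDIDATES = 32
NUM_ITERATIONS = 5
EXPLORATION_NOISE = 1e-4  # assumed here to be a Gaussian std on the action


def toy_return(plan):
    """Hypothetical stand-in for a world model's predicted return.

    Highest when every action in the plan equals 0.5.
    """
    return -sum((a - 0.5) ** 2 for a in plan)


def cem_plan(score_fn, rng, horizon=PLANNING_LENGTH):
    """Generic CEM: sample plans, keep the best, refit the sampler."""
    mean = [0.0] * horizon  # assumed initial sampling mean
    std = [1.0] * horizon   # assumed initial sampling std
    for _ in range(NUM_ITERATIONS):
        # Sample candidate action sequences from the current Gaussian.
        candidates = [
            [rng.gauss(mean[t], std[t]) for t in range(horizon)]
            for _ in range(NUM_CANDIDATES)
        ]
        # Keep the highest-scoring candidates (the "top candidates").
        candidates.sort(key=score_fn, reverse=True)
        elites = candidates[:NUM_TOP_CANDIDATES]
        # Refit the Gaussian to the elites, one time step at a time.
        for t in range(horizon):
            column = [plan[t] for plan in elites]
            mean[t] = statistics.fmean(column)
            std[t] = statistics.pstdev(column) + 1e-6
    # Execute the first planned action, perturbed by exploration noise.
    first_action = mean[0] + rng.gauss(0.0, EXPLORATION_NOISE)
    return first_action, mean


rng = random.Random(0)
action, mean = cem_plan(toy_return, rng)
print(action)
```

For the discrete-action Chemical settings, the table lists an exploration probability of 0.05 instead of exploration noise; a categorical sampling distribution would replace the Gaussian in that case, which this sketch does not cover.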