Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

FACTS: A Factored State-Space Framework for World Modelling

Authors: Li Nanbo, Firas Laakom, Yucheng Xu, Wenyi Wang, Jürgen Schmidhuber

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We empirically evaluate FACTS across diverse tasks, including multivariate time series forecasting, object-centric world modelling, and spatial-temporal graph prediction, demonstrating that it consistently outperforms or matches specialised state-of-the-art models, despite its general-purpose world modelling design. ... We conduct an extensive empirical analysis across multiple tasks, such as multivariate time series forecasting and object-centric world modelling, demonstrating that FACTS consistently matches or exceeds the performance of specialised state-of-the-art models.
Researcher Affiliation	Academia	Li Nanbo (1), Firas Laakom (1), Yucheng Xu (2), Wenyi Wang (1), Jürgen Schmidhuber (1,3). (1) Center of Excellence for Generative AI, KAUST, Saudi Arabia; (2) School of Informatics, University of Edinburgh, United Kingdom; (3) The Swiss AI Lab, IDSIA, USI & SUPSI, Switzerland
Pseudocode	Yes	Algorithm 1 FACTS Module: a Pseudo Implementation. 1: Input: $X_{1:t} \in \mathbb{R}^{t \times m \times d}$ (t: sequential axis, m: spatial axis) 2: Output: $Z_{1:t} \in \mathbb{R}^{t \times k \times d}$
Open Source Code	Yes	Code available at: https://github.com/NanboLi/FACTS.
Open Datasets	Yes	We use the open-source Time Series Library (TSLib) to evaluate long-term multivariate time-series forecasting (MSTF) tasks across 9 real-world datasets spanning multiple domains (e.g., energy, weather, and finance). ... synthetic multi-object videos (Yi et al., 2020; Greff et al., 2022; Lin et al., 2020), and dynamic-graph node prediction (Li et al., 2018).
Dataset Splits	Yes	We use the open-source Time Series Library (TSLib)... following TSLib's standardised settings: the input sequence length is fixed at 96, with prediction lengths of {96, 192, 336, 720}. ... During testing, to ensure a fair comparison with SlotFormer, we burn-in the first 6 frames and roll out (predict) 48 frames.
Hardware Specification	Yes	All results reported for FACTS in this paper were generated using a single NVIDIA A100 GPU (80 GB).
Software Dependencies	No	The paper mentions "PyTorch" but does not specify a version number, nor does it list other software dependencies with specific versions. For example: "We implement this using PyTorch's standard Conv2d module".
Experiment Setup	Yes	following TSLib's standardised settings: the input sequence length is fixed at 96, with prediction lengths of {96, 192, 336, 720}. Performance is evaluated using mean-squared error (MSE) and mean-absolute error (MAE). ... For the slot dynamics prediction task... we burn-in the first 6 frames and roll out (predict) 48 frames. ... The primary modification is replacing SAVi's recurrent slot attention modules with FACTS. Importantly, all of the used modules (CNN vision encoders, FACTS, and decoders) are trained end-to-end in a single run without any supervision.
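
The forecasting protocol quoted in the Experiment Setup row (input length 96, prediction lengths {96, 192, 336, 720}, scored with MSE and MAE) can be illustrated with a minimal sketch. This is not the FACTS or TSLib code: the stride-1 windowing, the helper names `make_windows`, `mse`, `mae`, and `naive_forecast`, and the synthetic series are all assumptions made for illustration only.

```python
# Illustrative sketch only; not taken from the FACTS or TSLib code.
from typing import List, Sequence, Tuple

INPUT_LEN = 96                    # input sequence length stated in the setup
PRED_LENS = (96, 192, 336, 720)   # prediction lengths stated in the setup


def make_windows(
    series: Sequence[float], input_len: int, pred_len: int
) -> List[Tuple[List[float], List[float]]]:
    """Slice a 1-D series into (input, target) pairs; stride 1 is an assumption."""
    last_start = len(series) - input_len - pred_len
    return [
        (
            list(series[s:s + input_len]),
            list(series[s + input_len:s + input_len + pred_len]),
        )
        for s in range(last_start + 1)
    ]


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean-squared error over one prediction window."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def mae(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean-absolute error over one prediction window."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def naive_forecast(history: Sequence[float], pred_len: int) -> List[float]:
    """Hypothetical stand-in model: repeat the last observed value."""
    return [history[-1]] * pred_len


if __name__ == "__main__":
    # Synthetic periodic series, used only to exercise the functions.
    series = [float(i % 24) for i in range(1000)]
    for pred_len in PRED_LENS:
        windows = make_windows(series, INPUT_LEN, pred_len)
        scores = [
            (mse(naive_forecast(x, pred_len), y), mae(naive_forecast(x, pred_len), y))
            for x, y in windows
        ]
        avg_mse = sum(s[0] for s in scores) / len(scores)
        avg_mae = sum(s[1] for s in scores) / len(scores)
        print(f"pred_len={pred_len}: MSE={avg_mse:.3f} MAE={avg_mae:.3f}")
```

A real evaluation would replace `naive_forecast` with the trained model and use the benchmark's own data splits and normalisation, which this index entry does not detail.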