Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

LaM-SLidE: Latent Space Modeling of Spatial Dynamical Systems via Linked Entities

Authors: Florian Sestak, Artur Toshev, Andreas Fürst, Günter Klambauer, Andreas Mayr, Johannes Brandstetter

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In order to evaluate LAM-SLIDE we focus on three key aspects: (i) Robust generalization in diverse domains. We examine LAM-SLIDE's generalization in different data domains in relation to other methods, for which we utilize tracking data from human motion behavior, trajectories of particle systems, and data from molecular dynamics (MD) simulations. (ii) Temporal adaptability. We evaluate temporal adaptability through various conditioning/prediction horizons, considering single/multi-frame conditioning and short/long-term predictions; (iii) Computational efficiency and scalability. Finally, we assess the inference time of LAM-SLIDE, and evaluate performance with respect to model size. The subsequent sections show our key findings. For implementation details and additional results see App. E and G, for information on the datasets see App. E.1.
Researcher Affiliation	Collaboration	1 ELLIS Unit, LIT AI Lab, Institute for Machine Learning, JKU Linz, Austria; 2 Department of Engineering Physics and Computation, TUM, Germany; 3 Emmi AI GmbH, Linz, Austria; 4 Clinical Research Institute for Medical AI, JKU Linz, Austria
Pseudocode	Yes	B.5 Python Pseudocode This section shows Python pseudocode for training and inference for the latent approximator model.
Open Source Code	Yes	Code is available at https://github.com/ml-jku/LaM-SLidE.
Open Datasets	Yes	E.1 Datasets Pedestrian Movement. The pedestrian movement dataset, along with its data processing, is available at https://github.com/MediaBrain-SJTU/EqMotion. Basketball Player Movement. The dataset, along with its predefined splits, is available at https://github.com/xupei0610/SocialVAE. N-Body. The dataset creation scripts, along with their predefined splits, are available at https://github.com/hanjq17/GeoTDM. Small Molecules (MD17). The MD17 dataset is available at http://www.sgdml.org/#datasets. Preprocessing and dataset splits follow Han et al. [35] and can be accessed through their GitHub repository at https://github.com/hanjq17/GeoTDM. Tetrapeptides. The dataset, including the full simulation parameters for ground truth simulations, is sourced from Jing et al. [44] and is publicly available in their GitHub repository at https://github.com/bjing2016/mdgen.
Dataset Splits	Yes	For all three scenarios, we consider 10 conditioning frames and 20 prediction frames. In line with Han et al. [35], we use 3000 trajectories for training and 2000 trajectories for validation and testing... The dataset comprises 5,000 training, 1000 validation, and 1000 test trajectories for each molecule... The dataset comprises 3,109 training, 100 validation, and 100 test peptides.
Hardware Specification	Yes	Our experiments were conducted using a system with 128 CPU cores and 2048GB of system memory. Model training was performed on 4 NVIDIA H200 GPUs, each equipped with 140GB of VRAM.
Software Dependencies	Yes	We used PyTorch 2 [7] for the implementation of our models. Our training pipeline was structured with PyTorch Lightning [28]. We used Hydra [96] to run our experiments with different hyperparameter settings. Our experiments were tracked with Weights & Biases [12].
Experiment Setup	Yes	Tabs. 8 to 12 show the hyperparameters for the individual tasks, loss functions are as defined in App. E.4. For all trained models, we use the AdamW [50, 58] optimizer and use EMA [31] in each update step with a decay parameter of β = 0.999.
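
The Experiment Setup entry mentions an exponential moving average (EMA) of the model weights with decay β = 0.999, applied at each update step. The following is a minimal, dependency-free sketch of that update rule, not the authors' implementation: the function name `ema_update`, the dictionary-of-floats parameter layout, and the toy "optimizer step" are all assumptions made for illustration. Only the value β = 0.999 comes from the quoted text.

```python
from typing import Dict


def ema_update(shadow: Dict[str, float],
               params: Dict[str, float],
               beta: float = 0.999) -> None:
    """Update the EMA copy in place: shadow <- beta * shadow + (1 - beta) * params.

    `shadow` and `params` are hypothetical name -> value maps standing in
    for model weights; a real model would hold tensors instead of floats.
    """
    for name, value in params.items():
        shadow[name] = beta * shadow[name] + (1.0 - beta) * value


# Toy usage: a single scalar "weight" that grows by 1.0 per step.
params = {"w": 1.0}
shadow = dict(params)  # the EMA copy starts as a copy of the weights

for _ in range(3):
    params["w"] += 1.0          # stand-in for one optimizer (e.g. AdamW) step
    ema_update(shadow, params)  # EMA is refreshed after every update step

print(params["w"], shadow["w"])
```

With β close to 1, the averaged weights trail the raw weights and change slowly, which is why EMA copies are commonly used for evaluation rather than the raw weights.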