Simplifying Latent Dynamics with Softly State-Invariant World Models
Authors: Tankred Saanum, Peter Dayan, Eric Schulz
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the PLSM's effect on planning algorithms' ability to learn policies in continuous control tasks. |
| Researcher Affiliation | Academia | Tankred Saanum^1, Peter Dayan^1,2, Eric Schulz^1,3; 1: Max Planck Institute for Biological Cybernetics; 2: University of Tübingen; 3: Helmholtz Institute for Human-Centered AI, Helmholtz Center Munich, Neuherberg, Germany; tankred.saanum@tuebingen.mpg.de |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | We will release code with model implementation upon publication. |
| Open Datasets | Yes | We evaluated the efficacy of parsimonious dynamics for control in five state-based continuous control tasks from the DeepMind Control Suite (DMC) [11]. ... We trained the PLSM-augmented SPR algorithm on 100k environment steps across 5 seeds on all 26 Atari games. ... Additionally we created an environment based on the dSprites dataset [18]... Lastly, we evaluate PLSM on a dynamic object interaction dataset with realistic textures and physics without actions, MOVi-E [19] |
| Dataset Splits | No | No explicit statement detailing specific training/validation/test splits (e.g., percentages, sample counts, or explicit predefined splits) was found. |
| Hardware Specification | Yes | We ran all experiments reported in the paper on compute nodes with 2 Nvidia A100 GPUs. |
| Software Dependencies | No | The paper mentions software components like PyTorch (implicitly via `torch` import), ReLU activation, and Adam optimizer, but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | Table 2: Contrastive model hyperparameters. Hidden units: 512; Batch size: 512; MLP hidden layers: 2; Latent dimensions \|z_t\|: 50; Query dimensions \|h_t\|: 50; Regularization coefficient β: 0.1; Margin λ: 1; Learning rate: 0.001; Activation function: ReLU [50]; Optimizer: Adam [51] |
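The hyperparameters in Table 2 map onto a small latent dynamics MLP. Below is a minimal PyTorch sketch that instantiates those reported values (2 hidden layers of 512 units, ReLU, 50-dimensional latents, Adam with learning rate 0.001); the class name, the action dimensionality, and the overall architecture are assumptions for illustration, since the paper's actual PLSM implementation is not released. The regularization coefficient β = 0.1 and margin λ = 1 belong to the contrastive training objective, which is not sketched here.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of a latent dynamics MLP using the Table 2 hyperparameters.
# LatentDynamicsMLP and action_dim are illustrative assumptions, not the paper's code.
class LatentDynamicsMLP(nn.Module):
    def __init__(self, latent_dim=50, action_dim=6, hidden_units=512, hidden_layers=2):
        super().__init__()
        layers = []
        in_dim = latent_dim + action_dim
        for _ in range(hidden_layers):  # 2 hidden layers of 512 units, ReLU activations
            layers += [nn.Linear(in_dim, hidden_units), nn.ReLU()]
            in_dim = hidden_units
        layers.append(nn.Linear(in_dim, latent_dim))  # predict the next latent z_{t+1}
        self.net = nn.Sequential(*layers)

    def forward(self, z_t, a_t):
        # Condition the transition on the current latent state and action
        return self.net(torch.cat([z_t, a_t], dim=-1))

model = LatentDynamicsMLP()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam, learning rate 0.001
```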