Simplifying Latent Dynamics with Softly State-Invariant World Models

Authors: Tankred Saanum, Peter Dayan, Eric Schulz

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate the PLSM's effect on planning algorithms' ability to learn policies in continuous control tasks."
Researcher Affiliation | Academia | "Tankred Saanum (1), Peter Dayan (1,2), Eric Schulz (1,3). 1: Max Planck Institute for Biological Cybernetics; 2: University of Tübingen; 3: Helmholtz Institute for Human-Centered AI, Helmholtz Center Munich, Neuherberg, Germany. tankred.saanum@tuebingen.mpg.de"
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | "We will release code with model implementation upon publication."
Open Datasets | Yes | "We evaluated the efficacy of parsimonious dynamics for control in five state-based continuous control tasks from the DeepMind Control Suite (DMC) [11]. ... We trained the PLSM-augmented SPR algorithm on 100k environment steps across 5 seeds on all 26 Atari games. ... Additionally we created an environment based on the dSprites dataset [18] ... Lastly, we evaluate PLSM on a dynamic object interaction dataset with realistic textures and physics without actions, MOVi-E [19]."
Dataset Splits | No | No explicit statement detailing specific training/validation/test splits (e.g., percentages, sample counts, or predefined splits) was found.
Hardware Specification | Yes | "We ran all experiments reported in the paper on compute nodes with 2 Nvidia A100 GPUs."
Software Dependencies | No | The paper mentions software components such as PyTorch (implicitly via the `torch` import), ReLU activations, and the Adam optimizer, but does not provide version numbers for these dependencies.
Experiment Setup | Yes | "Table 2: Contrastive model hyperparameters. Hidden units: 512; batch size: 512; MLP hidden layers: 2; latent dimensions |z_t|: 50; query dimensions |h_t|: 50; regularization coefficient β: 0.1; margin λ: 1; learning rate: 0.001; activation function: ReLU [50]; optimizer: Adam [51]."
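As a rough sanity check on the Table 2 hyperparameters, the size of a 2-hidden-layer MLP with 512 hidden units mapping between 50-dimensional latents can be computed directly. This is a hypothetical sketch: the paper excerpt does not state the MLP's exact input/output interface, so the 50-in/50-out assumption here is ours, not the authors'.

```python
def mlp_param_count(in_dim, hidden, n_hidden_layers, out_dim):
    """Count weights + biases of a fully connected MLP.

    dims lists the layer widths end to end; each consecutive pair
    (d_in, d_out) contributes a weight matrix (d_in * d_out) and a
    bias vector (d_out).
    """
    dims = [in_dim] + [hidden] * n_hidden_layers + [out_dim]
    return sum(d_in * d_out + d_out for d_in, d_out in zip(dims, dims[1:]))

# Table 2 values: |z_t| = 50 latent dims, 512 hidden units, 2 hidden layers.
# Assumed (hypothetical) interface: latent in, latent out.
print(mlp_param_count(in_dim=50, hidden=512, n_hidden_layers=2, out_dim=50))
# → 314418
```

Under these assumptions the dynamics MLP is small (~0.3M parameters), consistent with a model meant to be trained on only 100k environment steps.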