Facing Off World Model Backbones: RNNs, Transformers, and S4

Authors: Fei Deng, Junyeong Park, Sungjin Ahn

NeurIPS 2023

Reproducibility variables, results, and LLM responses:

Research Type: Experimental
LLM Response: "Furthermore, we extensively compare RNN-, Transformer-, and S4-based world models across four sets of environments, which we have tailored to assess crucial memory capabilities of world models, including long-term imagination, context-dependent recall, reward prediction, and memory-based reasoning."

Researcher Affiliation: Academia
LLM Response: "Fei Deng, Rutgers University, fei.deng@rutgers.edu; Junyeong Park, KAIST, jyp10987@kaist.ac.kr; Sungjin Ahn, KAIST, sungjin.ahn@kaist.ac.kr"

Pseudocode: Yes
LLM Response: "Algorithm 1: S4WM Training; Algorithm 2: S4WM Imagination"

Open Source Code: No
LLM Response: "https://fdeng18.github.io/s4wm is provided as a project page, but the paper does not contain an explicit statement such as 'We release our code,' nor a direct link to a code repository for the described methodology."

Open Datasets: No
LLM Response: "For each 3D environment (i.e., Two Rooms, Four Rooms, and Ten Rooms), we generate 30K trajectories using a scripted policy... For each 2D environment (i.e., Distracting Memory and Multi Doors Keys), we generate 10K trajectories using a scripted policy... The paper states that it generated its own datasets from these environments but does not provide access to the collected datasets."

Dataset Splits: Yes
LLM Response: "For each 3D environment (i.e., Two Rooms, Four Rooms, and Ten Rooms), we generate 30K trajectories using a scripted policy, of which 28K are used for training, 1K for validation, and 1K for testing. For each 2D environment (i.e., Distracting Memory and Multi Doors Keys), we generate 10K trajectories using a scripted policy, of which 8K are used for training, 1K for validation, and 1K for testing."

Hardware Specification: Yes
LLM Response: "All results in Figure 3 are obtained on a single NVIDIA RTX A6000 GPU."

Software Dependencies: No
LLM Response: "The paper mentions using the AdamW optimizer and SiLU nonlinearity, and states that the implementation is based on the S4 [21] and DreamerV3 [31] code, but it does not specify version numbers for any software components or libraries."

Experiment Setup: Yes
LLM Response: "Hyperparameters and further implementation details can be found in Appendix J. Tables 10 and 11 list the hyperparameters for the 3D and 2D environments, respectively, including optimizer, batch size, learning rate, weight decay, gradient clipping, and various model architectural parameters."
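The reported dataset splits (28K/1K/1K for each 3D environment, 8K/1K/1K for each 2D environment) are simple to reproduce once the trajectories exist. As a minimal sketch, assuming trajectories are stored as a Python list (the function and variable names here are illustrative, not taken from the paper's code), the partitioning could look like:

```python
# Sketch of the reported train/val/test split: fixed-size validation and test
# sets taken from the end of the trajectory list, the remainder used for
# training. Names are illustrative assumptions, not the authors' code.
def split_trajectories(trajectories, n_val=1000, n_test=1000):
    """Partition a list of trajectories into train/val/test subsets."""
    n_train = len(trajectories) - n_val - n_test
    train = trajectories[:n_train]
    val = trajectories[n_train:n_train + n_val]
    test = trajectories[n_train + n_val:]
    return train, val, test

# 3D environments: 30K trajectories -> 28K train, 1K val, 1K test
train, val, test = split_trajectories(list(range(30_000)))
print(len(train), len(val), len(test))  # 28000 1000 1000
```

The same call on 10K trajectories yields the 8K/1K/1K split reported for the 2D environments.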