Slot State Space Models

Authors: Jindong Jiang, Fei Deng, Gautam Singh, Minseung Lee, Sungjin Ahn

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments, we evaluate our model in object-centric learning, 3D visual reasoning, and long-context video understanding tasks, which involve modeling multiple objects and their long-range temporal dependencies. We find that our proposed design offers substantial performance gains over existing sequence modeling methods.
Researcher Affiliation | Academia | Jindong Jiang (Rutgers University), Fei Deng (Rutgers University), Gautam Singh (Rutgers University), Minseung Lee (KAIST), Sungjin Ahn (KAIST)
Pseudocode | Yes | We include pseudo-code of the Mamba block implementation in Algorithm 1.
Open Source Code | No | We will make our work open-source upon acceptance.
Open Datasets | Yes | We utilize the bouncing balls video dataset introduced by [62], which consists of videos of white balls bouncing off each other in an empty window.
Dataset Splits | No | The paper describes how data is used for training, pre-training, fine-tuning, and testing in various experiments, including specific counts and sampling strategies for some datasets. However, explicit validation dataset splits (percentages or counts) are not clearly stated or provided for all experiments in a unified manner.
Hardware Specification | No | The paper does not specify the exact hardware used for experiments, such as specific GPU or CPU models. It only generally refers to 'our academic research lab’s computing resource constraints'.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., 'PyTorch 1.9', 'CUDA 11.1').
Experiment Setup | Yes | Table 2 provides detailed hyperparameters including 'Batch Size', 'Training Steps', 'Sequence Length', 'Optimizer', 'Weight Decay', and 'Learning Rate' for both Blinking Color Balls and MOVi-A experiments.