Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling
Authors: Raunaq Bhirangi, Chenyu Wang, Venkatesh Pattabiraman, Carmel Majidi, Abhinav Gupta, Tess Hellebrekers, Lerrel Pinto
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Across six real-world sensor datasets, from tactile-based state prediction to accelerometer-based inertial measurement, HiSS outperforms state-of-the-art sequence models such as causal Transformers, LSTMs, S4, and Mamba by at least 23% on MSE. Our experiments further indicate that HiSS demonstrates efficient scaling to smaller datasets and is compatible with existing data-filtering techniques. (See the architecture sketch below the table.) |
| Researcher Affiliation | Collaboration | ¹Carnegie Mellon University, Pittsburgh, USA; ²FAIR, Meta; ³New York University, NYC, USA. |
| Pseudocode | No | The paper describes models and architectures in text and diagrams (e.g., Figure 4) but does not include any explicit pseudocode blocks or algorithm listings. |
| Open Source Code | Yes | Code, datasets and videos can be found on https://hiss-csp.github.io |
| Open Datasets | Yes | We release CSP-Bench, the largest publicly accessible benchmark for continuous sequence-to-sequence prediction for multiple sensor datasets. ... Code, datasets and videos can be found on https://hiss-csp.github.io |
| Dataset Splits | Yes | For all tactile datasets and VECtor, we use an 80-20 train-validation split. For the RoNIN dataset, we use the first four minutes of every trajectory for our analysis, and use a validation set consisting of trajectories from unseen subjects. For TotalCapture, we use the train-validation split proposed by Trumble et al. (2017). (See the split sketch below the table.) |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, or memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used for the experiments. |
| Experiment Setup | Yes | All our models are trained end-to-end to minimize MSE loss as explained in Section 3.1. ... All models are trained for 600 epochs at a constant learning rate of 1e-3. ... Hyperparameter sweep ranges for each of our models and baselines, along with the resulting range of parameter counts are listed in Appendix B. ... Table 5. Hyperparameters for flat architectures ... Table 6. Hyperparameters for low-level models used in hierarchical architectures (see the training-loop sketch below the table) |
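
The paper's core idea is a two-level hierarchy: a low-level sequence model summarizes fixed-size chunks of the raw sensor stream, and a high-level model runs over the chunk summaries at a lower rate. Below is a minimal PyTorch sketch of that structure, not the authors' implementation: `nn.GRU` stands in for the S4/Mamba state space blocks, and the chunk length and layer widths are illustrative.

```python
# Minimal sketch of a two-level hierarchical sequence model (HiSS-style).
# nn.GRU is a stand-in for the paper's SSM blocks; all sizes are illustrative.
import torch
import torch.nn as nn

class HierarchicalSeqModel(nn.Module):
    def __init__(self, in_dim, out_dim, chunk_len=10, low_dim=32, high_dim=64):
        super().__init__()
        self.chunk_len = chunk_len
        self.low = nn.GRU(in_dim, low_dim, batch_first=True)     # low-level model, runs within chunks
        self.high = nn.GRU(low_dim, high_dim, batch_first=True)  # high-level model, runs over chunk summaries
        self.head = nn.Linear(high_dim, out_dim)                 # per-chunk prediction head

    def forward(self, x):                       # x: (batch, seq_len, in_dim)
        b, t, d = x.shape
        assert t % self.chunk_len == 0, "sequence length must be a multiple of chunk_len"
        chunks = x.reshape(b * t // self.chunk_len, self.chunk_len, d)
        feats, _ = self.low(chunks)             # process each chunk independently
        feats = feats[:, -1]                    # last hidden state as the chunk summary
        feats = feats.reshape(b, t // self.chunk_len, -1)
        out, _ = self.high(feats)               # lower-rate sequence over chunk summaries
        return self.head(out)                   # (batch, seq_len / chunk_len, out_dim)

model = HierarchicalSeqModel(in_dim=6, out_dim=3)
y = model(torch.randn(4, 100, 6))
print(y.shape)  # torch.Size([4, 10, 3])
```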
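
The 80-20 train-validation split reported for the tactile datasets and VECtor could be reproduced along these lines. The dataset tensors and seed here are stand-ins: the paper does not state a split seed, and RoNIN/TotalCapture use subject-level or previously published splits instead of a random one.

```python
# Sketch of an 80-20 random train-validation split; data and seed are stand-ins.
import torch
from torch.utils.data import TensorDataset, random_split

dataset = TensorDataset(torch.randn(1000, 100, 6),   # sensor sequences (placeholder)
                        torch.randn(1000, 10, 3))    # per-chunk targets (placeholder)
n_train = int(0.8 * len(dataset))
train_set, val_set = random_split(
    dataset, [n_train, len(dataset) - n_train],
    generator=torch.Generator().manual_seed(0),      # fixed seed for reproducibility
)
print(len(train_set), len(val_set))  # 800 200
```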
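
Finally, a minimal training loop matching the stated setup: end-to-end MSE loss, 600 epochs, constant learning rate of 1e-3, reusing `model` and `train_set` from the sketches above. The optimizer (Adam) and batch size are assumptions; the quoted excerpt does not name them.

```python
# Sketch of the reported training setup: MSE loss, 600 epochs, constant lr=1e-3.
# Optimizer choice and batch size are assumptions, not stated in the excerpt.
import torch
from torch.utils.data import DataLoader

loader = DataLoader(train_set, batch_size=32, shuffle=True)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # constant LR, no schedule
loss_fn = torch.nn.MSELoss()

for epoch in range(600):
    for x, y in loader:
        opt.zero_grad()
        loss = loss_fn(model(x), y)  # sequence-to-sequence MSE, end to end
        loss.backward()
        opt.step()
```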