Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models

Authors: Ali Behrouz, Michele Santacatterina, Ramin Zabih

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experimental evaluation shows the superior performance of Chimera on extensive and diverse benchmarks, including ECG and speech time series classification, long-term and short-term time series forecasting, and time series anomaly detection." (...) "4 Experiments. Goals and Baselines. We evaluate Chimera on a wide range of time series tasks."
Researcher Affiliation | Academia | Ali Behrouz (Cornell University, ab2947@cornell.edu); Michele Santacatterina (New York University, santam13@nyu.edu); Ramin Zabih (Cornell University, rdz@cs.cornell.edu)
Pseudocode | No | The paper discusses an algorithm for a 2D selective scan and an operator, but it does not present structured pseudocode or an explicitly labeled algorithm block (see the hypothetical sketch after this table).
Open Source Code | No | The paper does not provide a statement about open-sourcing its code or a link to a code repository.
Open Datasets | Yes | "We perform experiments in long-term forecasting task on benchmark datasets [6]." (...)
Dataset Splits | Yes | "Table 7: Dataset descriptions. The dataset size is organized in (Train, Validation, Test). ETTm1, ETTm2 (...) (34465, 11521, 11521)" (see the split-size sketch after this table).
Hardware Specification | No | The paper discusses efficiency (e.g., "faster training", "less memory consumption") and wall-clock scaling but does not specify the exact hardware (e.g., GPU/CPU models, memory) used for experiments.
Software Dependencies | No | The paper does not provide specific software dependencies (e.g., library names with version numbers) needed to replicate the experiments.
Experiment Setup | No | The paper describes model architecture details but does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) or detailed training configurations in the main text or its appendices.
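
On the Pseudocode row: since the paper describes a 2D selective scan without printing an algorithm block, the following is a minimal sketch of what a (non-selective) 2D SSM recurrence of this kind can look like, assuming a Roesser-style update with separate state matrices along the time and variate axes. The names (A_t, A_v, B, C), the exact update, and the demo values are our illustration, not the authors' method.

```python
import numpy as np

def ssm_2d(x, A_t, A_v, B, C):
    """Hypothetical 2D SSM recurrence (our sketch, not Chimera's exact update).
    x: (T, V) input over T time steps and V variates.
    A_t, A_v: (d, d) state matrices along the time and variate axes.
    B: (d,) input map, C: (d,) output map. Returns y: (T, V)."""
    T, V = x.shape
    d = A_t.shape[0]
    h = np.zeros((T + 1, V + 1, d))  # zero-padded hidden states
    y = np.zeros((T, V))
    for t in range(T):
        for v in range(V):
            # the state mixes the previous state along each of the two axes
            h[t + 1, v + 1] = A_t @ h[t, v + 1] + A_v @ h[t + 1, v] + B * x[t, v]
            y[t, v] = C @ h[t + 1, v + 1]
    return y

# tiny demo with arbitrary parameters
rng = np.random.default_rng(0)
y = ssm_2d(rng.standard_normal((16, 4)),
           0.9 * np.eye(2), 0.1 * np.eye(2), np.ones(2), np.ones(2))
```

A selective variant would additionally make A_t, A_v, B, and C functions of the input, and the sequential double loop above is what a parallel scan formulation would replace.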
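
On the Dataset Splits row: the (34465, 11521, 11521) ETTm sizes quoted from Table 7 are consistent with the standard Informer-style protocol of 12/4/4 months at 15-minute resolution, counted as sliding look-back windows. The seq_len = 96 choice below is our assumption; only the reported sizes come from the paper.

```python
# Sketch reproducing the (34465, 11521, 11521) ETTm split sizes under the
# common 12/4/4-month protocol; seq_len = 96 is assumed, not stated here.
seq_len = 96                 # look-back window (assumption)
month = 30 * 24 * 4          # 15-minute steps per 30-day month
train_pts, val_pts, test_pts = 12 * month, 4 * month, 4 * month

# validation/test segments are extended backwards by seq_len so every
# target has a full history; each split is counted as sliding windows
train = train_pts - seq_len + 1            # 34465
val = (val_pts + seq_len) - seq_len + 1    # 11521
test = (test_pts + seq_len) - seq_len + 1  # 11521
print(train, val, test)                    # -> 34465 11521 11521
```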