MEMTO: Memory-guided Transformer for Multivariate Time Series Anomaly Detection

Authors: Junho Song, Keonwoo Kim, Jeonglyul Oh, Sungzoon Cho

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our proposed method on five real-world datasets from diverse domains, and it achieves an average anomaly detection F1-score of 95.74%, significantly outperforming the previous state-of-the-art methods. We also conduct extensive experiments to empirically validate the effectiveness of our proposed model's key components."
Researcher Affiliation | Collaboration | Junho Song (1), Keonwoo Kim (1,2), Jeonglyul Oh (1), Sungzoon Cho (1). (1) Seoul National University; (2) VRCREW Inc.
Pseudocode | Yes | "Algorithm 1: Memory module initialization with K-means clustering." "Algorithm 2: Proposed Method MEMTO."
Open Source Code | No | The paper does not explicitly state that the source code for MEMTO is publicly available or provide a link to it.
Open Datasets | Yes | "We evaluate MEMTO on five real-world multivariate time series datasets. (i) Server Machine Dataset (SMD [33])... (ii & iii) Mars Science Laboratory rover (MSL) and Soil Moisture Active Passive satellite (SMAP) are public data released from NASA [13]... (iv) Secure Water Treatment (SWaT [18])... (v) Pooled Server Metrics (PSM [1])... We obtained SWaT by submitting a request through https://itrust.sutd.edu.sg/itrust-labs_datasets/."
Dataset Splits | Yes | "We split the training data into 80% for training and 20% for validation."
Hardware Specification | Yes | "Our experiments are conducted using the Pytorch framework on four NVIDIA GTX 1080 Ti 12GB GPUs."
Software Dependencies | No | The only relevant statement is "Our experiments are conducted using the Pytorch framework on four NVIDIA GTX 1080 Ti 12GB GPUs."; no library versions are specified.
Experiment Setup | Yes | "We set λ in the objective function to 0.01, use Adam optimizer [15] with a learning rate of 5e-5, and employ early stopping with the patience of 10 epochs against the validation loss during training. Our experiments are conducted using the Pytorch framework on four NVIDIA GTX 1080 Ti 12GB GPUs. Furthermore, during the execution of our experiment, we make partial references to the code of [40]. We performed a grid search to determine the values of each hyperparameter within the following range: λ ∈ {1e+0, 5e-1, 1e-1, 5e-2, 1e-2, 5e-3, 1e-3}; lr ∈ {1e-4, 3e-4, 5e-4, 1e-5, 3e-5, 5e-5}; τ ∈ {0.1, 0.3, 0.5, 0.7, 0.9}; M ∈ {5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100}."
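The report notes that Algorithm 1 of the paper initializes the memory module with K-means clustering. A minimal NumPy sketch of that idea, assuming encoder features are already available as an (N, C) array; the function name, shapes, and the plain Lloyd's-iteration K-means are illustrative assumptions, not the authors' code:

```python
import numpy as np

def init_memory_kmeans(features, n_items=10, n_iters=20, seed=0):
    """Initialize memory items as K-means centroids of encoder features.

    features: (N, C) array of query features (hypothetical encoder output).
    Returns an (n_items, C) array of memory items.
    """
    rng = np.random.default_rng(seed)
    # Start from n_items randomly chosen feature vectors (fancy indexing copies).
    centroids = features[rng.choice(len(features), n_items, replace=False)]
    for _ in range(n_iters):
        # Assign every feature to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster went empty.
        for k in range(n_items):
            members = features[labels == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return centroids
```

In the paper's setting the number of memory items M was itself grid-searched over {5, ..., 100}, so `n_items` would be set accordingly.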
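The early-stopping criterion quoted above (stop once the validation loss has not improved for 10 epochs) can be sketched in a few lines of plain Python; this is an illustrative helper under that stated setting, not the authors' implementation:

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop this would be called once per epoch after computing the loss on the 20% validation split, breaking out of the loop when `step` returns True.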