Outlier Detection for Time Series with Recurrent Autoencoder Ensembles

Authors: Tung Kieu, Bin Yang, Chenjuan Guo, Christian S. Jensen

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments with two real-world time series data sets, including univariate and multivariate time series, offer insight into the design properties of the proposed ensemble frameworks and demonstrate that the proposed frameworks are capable of outperforming both baselines and the state-of-the-art methods.
Researcher Affiliation | Academia | Tung Kieu, Bin Yang, Chenjuan Guo, and Christian S. Jensen, Department of Computer Science, Aalborg University, Denmark. {tungkvt, byang, cguo, csj}@cs.aau.dk
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The source code is available at https://github.com/tungk/OED.
Open Datasets | Yes | We use two real-world time series repositories, the univariate time series repository Numenta Anomaly Benchmark (NAB) and the multivariate time series repository Electrocardiography (ECG). https://github.com/numenta/NAB http://www.cs.ucr.edu/~eamonn/discords/ECG_data.zip/ (a data-loading sketch follows the table)
Dataset Splits | No | The paper mentions 'training' and 'evaluating accuracy' but does not specify explicit training, validation, and test dataset splits (e.g., percentages or counts).
Hardware Specification | Yes | Experiments are performed on a Linux workstation with dual 12-core Xeon E5 CPUs, 64 GB RAM, and 2 K40M GPUs.
Software Dependencies | Yes | All algorithms are implemented in Python 3.5. The proposed methods IF and SF and the deep learning methods, i.e., CNN, LSTM, and RSCN, are implemented using TensorFlow 1.4.0, while the remaining methods, i.e., LOF, SVM, ISF, and MP, are implemented using Scikit-learn 1.19. (a version-check sketch follows the table)
Experiment Setup | Yes | For all deep learning based methods, we use Adadelta [Zeiler, 2012] as the optimizer, and we set their learning rates to 10⁻³. For the proposed ensemble frameworks, we use an LSTM unit and tanh as the functions f and f' in Equation 5; we set the number of hidden LSTM units to 8; we set the default number of autoencoders N to 40, and we also study the effect of varying N from 10 to 40; and we set λ to 0.005. We randomly vary the skip-connection jump step size L from 1 to 10, and we randomly choose the sparse weight vectors w_t. (a configuration sketch follows the table)
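
To make the data setup concrete, here is a minimal loading sketch for one univariate NAB series, assuming the NAB repository has been cloned locally; the specific file chosen is illustrative, and the ECG archive is a separate zip download.

```python
# Minimal sketch (our addition): loading one univariate NAB series.
# Assumes https://github.com/numenta/NAB has been cloned into ./NAB;
# the specific file chosen below is illustrative.
import pandas as pd

# NAB data files are CSVs with "timestamp" and "value" columns.
series = pd.read_csv(
    "NAB/data/realKnownCause/machine_temperature_system_failure.csv",
    parse_dates=["timestamp"],
)
values = series["value"].to_numpy()  # raw univariate signal
print(len(values), "observations, starting", series["timestamp"].min())
```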
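Because the reported dependency versions are dated, a quick runtime check that an environment actually matches them can save debugging time during reproduction; this snippet is a sketch of ours, not part of the authors' release.

```python
# Minimal sketch (our addition): checking that an environment matches the
# versions quoted from the paper before attempting reproduction.
import sys

import sklearn
import tensorflow as tf

print("Python:", sys.version.split()[0])     # paper reports 3.5
print("TensorFlow:", tf.__version__)         # paper reports 1.4.0
print("scikit-learn:", sklearn.__version__)  # paper reports 1.19
```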
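To make the reported setup easier to scan, the sketch below collects the quoted hyperparameters into one configuration and shows the per-autoencoder randomization of the skip-connection jump step; all identifiers are hypothetical and not taken from the authors' repository.

```python
# Minimal sketch (our addition) of the reported hyperparameter setup.
# Identifiers are hypothetical, not taken from https://github.com/tungk/OED.
import random

config = {
    "optimizer": "Adadelta",  # [Zeiler, 2012]
    "learning_rate": 1e-3,    # 10⁻³ for all deep learning based methods
    "hidden_lstm_units": 8,
    "num_autoencoders": 40,   # default N; the paper varies N from 10 to 40
    "lambda": 0.005,          # regularization weight λ
}

# Diversity across the ensemble: each autoencoder draws its own
# skip-connection jump step size L uniformly from {1, ..., 10}.
# (The sparse weight vectors w_t are likewise chosen at random; their
# role is defined by Equation 5 in the paper.)
jump_steps = [random.randint(1, 10) for _ in range(config["num_autoencoders"])]
print("per-autoencoder jump steps:", jump_steps)
```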