Stochastic Online Anomaly Analysis for Streaming Time Series
Authors: Zhao Xu, Kristian Kersting, Lorenzo von Ritter
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical analysis on real-world datasets demonstrates the effectiveness of the method: "We verify the proposed OLAD method on both real-world and synthetic data." Network traffic: the Yahoo dataset of real network traffic to some of the Yahoo services (https://webscope.sandbox.yahoo.com/catalog.php?datatype=s&did=70). Financial data: the S&P 500 index (https://fred.stlouisfed.org/series/SP500) from January 2012 to January 2017. Synthetic data: supplementary experiments evaluating the predictive performance of OLAD in learning the underlying dynamics of contaminated time series on simulated data. |
| Researcher Affiliation | Collaboration | NEC Labs Europe, Germany; Technical University of Darmstadt, Germany; Technical University of Munich, Germany |
| Pseudocode | Yes | Algorithm 1: Online one-step ahead prediction for streaming time series |
| Open Source Code | No | The paper links the Twitter Anomaly Detection repository (https://github.com/twitter/AnomalyDetection) for a baseline method, but provides no link or explicit statement for its own method's source code. |
| Open Datasets | Yes | Network traffic: the Yahoo dataset of real network traffic to some of the Yahoo services (https://webscope.sandbox.yahoo.com/catalog.php?datatype=s&did=70). Financial data: the S&P 500 index (https://fred.stlouisfed.org/series/SP500) from January 2012 to January 2017. |
| Dataset Splits | No | The paper does not specify explicit training/validation/test splits with percentages or counts. It uses the first T = 100 time steps of the Yahoo dataset for initialization, but this is not a general train/validation/test split. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using specific methods and baselines like 'HESD [Vallis et al., 2014]' and 'GPEVT [Smith et al., 2012]' and 'GP [Rasmussen and Williams, 2006]', but does not specify any software libraries or dependencies with version numbers used for implementation. |
| Experiment Setup | Yes | For each time series, the observations collected at the first T = 100 time steps are used for initialization. At each time step t after the initial period, a one-step-ahead prediction for step t + 1 is made with the OLAD method. If the real observation y_{t+1} falls outside the 99.99% predictive interval, the observation at time t + 1 is identified as an anomaly. The kernel parameters are set to ρ = 1.0 and ℓ = exp(2.0). For the synthetic experiments, the length of each time series was n = 100; 30 time steps are assumed observed and the remaining part is predicted, with m = 0, 1, ..., 5 outliers randomly added to the observed steps. |
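The detection rule described above can be sketched as follows. This is a minimal illustration, not the authors' OLAD implementation: it uses a plain GP one-step-ahead posterior with the squared-exponential kernel parameters reported in the paper (ρ = 1.0, ℓ = exp(2.0)); the observation-noise variance and the test signal are assumptions for the sketch.

```python
# Sketch only: one-step-ahead GP prediction with anomaly flagging via the
# 99.99% predictive interval. Not the paper's OLAD code.
import numpy as np

RHO, ELL = 1.0, np.exp(2.0)   # kernel parameters reported in the paper
NOISE = 1e-2                  # observation-noise variance (assumed)
Z = 3.8906                    # two-sided 99.99% Gaussian quantile

def rbf(a, b):
    """Squared-exponential kernel k(a, b) = rho * exp(-(a - b)^2 / (2 ell^2))."""
    d = a[:, None] - b[None, :]
    return RHO * np.exp(-0.5 * (d / ELL) ** 2)

def predict_next(t_obs, y_obs, t_next):
    """Standard GP posterior mean and variance at time t_next given past data."""
    K = rbf(t_obs, t_obs) + NOISE * np.eye(len(t_obs))
    k = rbf(t_obs, np.array([t_next]))[:, 0]
    mean = float(k @ np.linalg.solve(K, y_obs))
    var = float(RHO + NOISE - k @ np.linalg.solve(K, k))
    return mean, var

def is_anomaly(y_next, mean, var):
    """Flag y_next if it falls outside the 99.99% predictive interval."""
    return abs(y_next - mean) > Z * np.sqrt(var)
```

In a streaming loop, `predict_next` would be called at each step t with the history so far, and the incoming observation checked with `is_anomaly` before being appended to the history, mirroring the paper's online one-step-ahead protocol.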