Stochastic Online Anomaly Analysis for Streaming Time Series

Authors: Zhao Xu, Kristian Kersting, Lorenzo von Ritter

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical analysis on real-world datasets demonstrates the effectiveness of our method. We verify the proposed OLAD method on both real-world and synthetic data. We first evaluate the performance of the OLAD in an online anomaly detection scenario. Experiments on network traffic data: We use the Yahoo dataset of real network traffic to some of the Yahoo services (https://webscope.sandbox.yahoo.com/catalog.php?datatype=s&did=70). Experiments on financial data: We also validate the OLAD method with the S&P 500 index data (https://fred.stlouisfed.org/series/SP500) from January 2012 to January 2017. Experiments on synthetic data: We further conduct some supplementary experiments to evaluate the predictive performance of the OLAD method in learning the underlying dynamics of the contaminated time series with the simulated data.
Researcher Affiliation | Collaboration | ¹NEC Labs Europe, Germany; ²Technical University of Darmstadt, Germany; ³Technical University of Munich, Germany
Pseudocode | Yes | Algorithm 1: Online one-step ahead prediction for streaming time series
Open Source Code | No | The paper mentions the Twitter AnomalyDetection repository (https://github.com/twitter/AnomalyDetection) for a baseline method, but does not provide a link to, or any explicit statement about, source code for its own method.
Open Datasets | Yes | Experiments on network traffic data: We use the Yahoo dataset of real network traffic to some of the Yahoo services (https://webscope.sandbox.yahoo.com/catalog.php?datatype=s&did=70). Experiments on financial data: We also validate the OLAD method with the S&P 500 index data (https://fred.stlouisfed.org/series/SP500) from January 2012 to January 2017.
Dataset Splits | No | The paper does not specify explicit training, validation, and test splits with percentages or counts. It mentions using the first T = 100 time steps of the Yahoo dataset for initialization, but this is not a general train/validation/test split for the entire dataset.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions baseline methods such as HESD [Vallis et al., 2014], GPEVT [Smith et al., 2012], and GP [Rasmussen and Williams, 2006], but does not specify any software libraries or dependencies with version numbers used for implementation.
Experiment Setup | Yes | For each time series, the observations collected at the first T = 100 time steps are viewed as initialization. At each time step t after the initial period, we make a one-step-ahead prediction for the next step t + 1 using the OLAD method. If the real observation y_{t+1} falls far outside the 99.99% predictive interval, then the observation at time t + 1 is identified as an anomaly event. We set the parameters of the kernel function as: ρ = 1.0 and ℓ = exp(2.0). The length of the time series was n = 100. We assume 30 time steps observed, and predict the remaining part of the time series. For the observed time steps, we randomly add m = 0, 1, ..., 5 outliers.
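The anomaly rule quoted in the setup row above (flag an observation that falls outside the 99.99% one-step-ahead predictive interval) can be sketched with a plain Gaussian-process regressor. This is an illustrative sketch only, not the paper's OLAD implementation: the squared-exponential kernel form, the observation-noise level, and the z ≈ 3.89 threshold (the two-sided 99.99% Gaussian quantile) are assumptions; only the hyperparameter values ρ = 1.0 and ℓ = exp(2.0) come from the quoted setup.

```python
import numpy as np

def rbf_kernel(s, t, rho=1.0, ell=np.exp(2.0)):
    """Squared-exponential kernel over time indices.

    rho and ell default to the values quoted in the experiment setup
    (rho = 1.0, ell = exp(2.0)); the kernel form itself is an assumption.
    """
    s, t = np.asarray(s, dtype=float), np.asarray(t, dtype=float)
    return rho * np.exp(-((s[:, None] - t[None, :]) ** 2) / (2.0 * ell ** 2))

def one_step_ahead(times, values, t_next, noise=1e-2):
    """Standard GP posterior mean and variance at the next time step.

    `noise` is an assumed observation-noise variance (not from the paper).
    """
    K = rbf_kernel(times, times) + noise * np.eye(len(times))
    k_star = rbf_kernel(np.array([t_next]), times)          # shape (1, n)
    alpha = np.linalg.solve(K, np.asarray(values, float))
    mean = (k_star @ alpha).item()
    var = (rbf_kernel(np.array([t_next]), np.array([t_next]))
           - k_star @ np.linalg.solve(K, k_star.T)).item() + noise
    return mean, var

def is_anomaly(y_obs, mean, var, z=3.89):
    """Flag y_obs if it lies outside the 99.99% predictive interval.

    z = 3.89 is approximately the two-sided 99.99% Gaussian quantile.
    """
    return abs(y_obs - mean) > z * np.sqrt(var)

# Usage: a smooth series, one clean next observation, one injected spike.
times = np.arange(30, dtype=float)
values = np.sin(0.1 * times)
mean, var = one_step_ahead(times, values, 30.0)
print(is_anomaly(np.sin(3.0), mean, var))         # clean point: not flagged
print(is_anomaly(np.sin(3.0) + 10.0, mean, var))  # large spike: flagged
```

The GP here is stationary and refit from scratch at each step; the paper's OLAD additionally handles streaming updates and contaminated observations, which this sketch does not attempt.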