reproducibilityindex.ai

Temporal Anomaly Detection: Calibrating the Surprise

Authors: Eyal Gutflaish, Aryeh Kontorovich, Sivan Sabato, Ofer Biller, Oded Sofer (pp. 3755-3762)

AAAI 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We provide a detailed description of the algorithm, including a convergence analysis, and report encouraging empirical results. One of the data sets that we tested is new for the public domain. It consists of two months' worth of database access records from a live system. This data set and our code are publicly available at https://github.com/eyalgut/TLR_anomaly_detection.git.
Researcher Affiliation	Collaboration	(1) Ben-Gurion University of the Negev, Beer Sheva, Israel; (2) IBM Security Division, Israel
Pseudocode	Yes	Algorithm 1 FindModel(λ, S): Find model matrix ... Algorithm 2 FoldedLL(B_t, π̂, G, H, U, V)
Open Source Code	Yes	This data set and our code are publicly available at https://github.com/eyalgut/TLR_anomaly_detection.git.
Open Datasets	Yes	One of the data sets that we tested is new for the public domain. It consists of two months' worth of database access records from a live system. This data set and our code are publicly available at https://github.com/eyalgut/TLR_anomaly_detection.git. ... The second data set is from Amazon (Lichman 2013). ... We further tested on the movie-rating data sets MovieLens (Harper and Konstan 2016) and Netflix (Bennett, Lanning, and others 2007).
Dataset Splits	Yes	We split S into two parts, S_1 = (B_1, ..., B_{T_1}), S_2 = (B_{T_1+1}, ..., B_T). S_1 is used to find an estimator π̂ for the probabilistic stationary model π, while S_2 is used to fit the log-likelihood regressor ŵ. ... k-fold cross-validation (k = 10) is performed to select λ ∈ Λ: In fold i, S_1 is divided into a training part S_1^t(i) and a validation part S_1^v(i)
Hardware Specification	Yes	Table 2: Run-time (seconds) on a 2.8 GHz Xeon CPU with 40 cores and 256 GB RAM.
Software Dependencies	No	The paper does not provide specific version numbers for any software dependencies.
Experiment Setup	Yes	The available data sets do not contain known anomalous accesses. Thus, in our experiments we injected anomalous behavior into random intervals, as explained below. ... For our algorithm, we used the following natural time-dependent features for regression: A binary weekend feature, the log-likelihood of the previous interval and of the one 24 hours ago (for TDA) or a week ago (for the others), the number of accesses in the current interval, the number of intervals since the last training set interval, day-of-the-week, and for TDA also hour of the day h ∈ {1, ..., 24} and shifted hour of the day ((h + 12) mod 24).
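
The time-dependent regression features quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `interval_features`, its arguments, the dictionary keys, the choice of Saturday and Sunday as weekend days, and the assumption that intervals are one hour long (so that "24 hours ago" is 24 intervals back) are all hypothetical choices made for illustration.

```python
from datetime import datetime, timedelta


def interval_features(t, starts, loglik, counts, last_train_idx,
                      lag=24, weekend_days=(5, 6), with_hour=True):
    """Build the feature dict for interval t (hypothetical helper).

    starts         -- list of datetime objects, start time of each interval
    loglik         -- list of per-interval log-likelihoods under the model
    counts         -- list of per-interval access counts
    last_train_idx -- index of the last interval in the training set
    lag            -- how many intervals back the "24 hours ago" (or
                      "a week ago") log-likelihood lies; 24 assumes
                      hourly intervals
    weekend_days   -- weekday() values treated as weekend (assumption:
                      Saturday=5, Sunday=6)
    with_hour      -- add the hour-of-day features (quoted as TDA only)
    """
    if t < lag or t < 1:
        raise ValueError("interval t needs at least `lag` earlier intervals")

    ts = starts[t]
    feats = {
        "weekend": 1 if ts.weekday() in weekend_days else 0,
        "ll_prev": loglik[t - 1],          # previous interval
        "ll_lag": loglik[t - lag],         # 24 hours (or a week) ago
        "n_accesses": counts[t],           # accesses in current interval
        "since_train": t - last_train_idx,  # intervals since training set
        "day_of_week": ts.weekday(),
    }
    if with_hour:
        h = ts.hour + 1                    # hour of day in {1, ..., 24}
        feats["hour"] = h
        feats["shifted_hour"] = (h + 12) % 24
    return feats


# Toy usage with synthetic values (not data from the paper).
starts = [datetime(2019, 1, 5) + timedelta(hours=i) for i in range(48)]
loglik = [-float(i) for i in range(48)]
counts = [2 * i for i in range(48)]
print(interval_features(30, starts, loglik, counts, last_train_idx=23))
```

The shifted hour gives the regressor a second encoding of time of day whose wrap-around point falls at a different hour than that of the plain hour feature. For the data sets where the quoted lag is a week rather than 24 hours, `lag` would be set to the number of intervals in a week and `with_hour` to `False`.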