Hidden Markov Anomaly Detection

Authors: Nico Goernitz, Mikio Braun, Marius Kloft

ICML 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The empirical evaluation on artificial and real data from the domains of computational biology and computational sustainability shows that the approach can achieve significantly higher anomaly detection performance than the regular one-class SVM. We conducted experiments for the scenario of label sequence learning where we have full access to the ground truth as well as two real-world scenarios from computational biology and computational sustainability.
Researcher Affiliation	Academia	Nico Görnitz (NICO.GOERNITZ@TU-BERLIN.DE), Berlin Institute of Technology, 10587 Berlin, Germany; Mikio Braun (MIKIO.BRAUN@TU-BERLIN.DE), Berlin Institute of Technology, 10587 Berlin, Germany; Marius Kloft (KLOFT@HU-BERLIN.DE), Humboldt University of Berlin, 10099 Berlin, Germany
Pseudocode	Yes	Algorithm 1 Hidden Markov Anomaly Detection
Open Source Code	No	The paper does not contain any explicit statement about releasing the source code for the methodology, nor does it provide a direct link to a code repository.
Open Datasets	Yes	We downloaded the genome of the widely studied escherichia coli bacteria, which is publicly available. The footnote for this statement is: http://www.sanger.ac.uk... .../resources/downloads/bacteria/escherichia-coli.html. We used the wind turbine simulator FAST (Jonkman et al., 2005) to generate simulated sensor readings. The weather conditions, i.e., wind speed and turbulence are modeled by the wind turbulence simulator TurbSim (Jonkman & Buhl, 2012).
Dataset Splits	Yes	From this data we selected half for training with various anomalous data fraction and the remaining for testing. The training set contained 200 examples of intergenic and genic examples with a total length of >170.000 nucleotides, while the testing set contained 350 intergenic and 50 genic examples of length >330.000 nucleotides, rendering this a computationally challenging experiment.
Hardware Specification	No	The paper mentions computational runtime performance ('computational runtime is higher than for vanilla OC-SVMs') but does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud instance types) used to conduct the experiments.
Software Dependencies	No	The paper references external tools like 'wind turbine simulator FAST (Jonkman et al., 2005)' and 'wind turbulence simulator TurbSim (Jonkman & Buhl, 2012)' (which has a cited user guide with 'Version 1.50'). However, it does not specify the versions of any programming languages, libraries, or solvers directly used in the implementation of their proposed method that would enable reproducibility.
Experiment Setup	No	The paper describes some model choices and optimal kernel parameters (e.g., 'binary state model consisting of 2 states and 4 possible transitions with an constant prior δ()', 'optimal kernel parameters (1.0 for the RBF kernel, 8 for the histogram kernel, and l1 for the linear kernel)'). However, it does not provide a comprehensive set of hyperparameters (e.g., learning rate, batch size, number of epochs, specific optimizer settings, or initialization details) that are typically required to fully reproduce the experimental setup.