Hidden Markov Anomaly Detection
Authors: Nico Goernitz, Mikio Braun, Marius Kloft
ICML 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The empirical evaluation on artificial and real data from the domains of computational biology and computational sustainability shows that the approach can achieve significantly higher anomaly detection performance than the regular one-class SVM. We conducted experiments for the scenario of label sequence learning where we have full access to the ground truth as well as two real-world scenarios from computational biology and computational sustainability. |
| Researcher Affiliation | Academia | Nico Görnitz NICO.GOERNITZ@TU-BERLIN.DE Berlin Institute of Technology, 10587 Berlin, Germany Mikio Braun MIKIO.BRAUN@TU-BERLIN.DE Berlin Institute of Technology, 10587 Berlin, Germany Marius Kloft KLOFT@HU-BERLIN.DE Humboldt University of Berlin, 10099 Berlin, Germany |
| Pseudocode | Yes | Algorithm 1 Hidden Markov Anomaly Detection |
| Open Source Code | No | The paper does not contain any explicit statement about releasing the source code for the methodology, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | We downloaded the genome of the widely studied escherichia coli bacteria, which is publicly available. The footnote for this statement is: http://www.sanger.ac.uk... .../resources/downloads/bacteria/escherichia-coli.html. We used the wind turbine simulator FAST (Jonkman et al., 2005) to generate simulated sensor readings. The weather conditions, i.e., wind speed and turbulence are modeled by the wind turbulence simulator TurbSim (Jonkman & Buhl, 2012). |
| Dataset Splits | Yes | From this data we selected half for training with various anomalous data fraction and the remaining for testing. The training set contained 200 examples of intergenic and genic examples with a total length of >170.000 nucleotides, while the testing set contained 350 intergenic and 50 genic examples of length >330.000 nucleotides, rendering this a computationally challenging experiment. |
| Hardware Specification | No | The paper mentions computational runtime performance ('computational runtime is higher than for vanilla OC-SVMs') but does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud instance types) used to conduct the experiments. |
| Software Dependencies | No | The paper references external tools like 'wind turbine simulator FAST (Jonkman et al., 2005)' and 'wind turbulence simulator TurbSim (Jonkman & Buhl, 2012)' (which has a cited user guide with 'Version 1.50'). However, it does not specify the versions of any programming languages, libraries, or solvers directly used in the implementation of their proposed method that would enable reproducibility. |
| Experiment Setup | No | The paper describes some model choices and optimal kernel parameters (e.g., 'binary state model consisting of 2 states and 4 possible transitions with a constant prior δ()', 'optimal kernel parameters (1.0 for the RBF kernel, 8 for the histogram kernel, and l1 for the linear kernel)'). However, it does not provide a comprehensive set of hyperparameters (e.g., learning rate, batch size, number of epochs, specific optimizer settings, or initialization details) that are typically required to fully reproduce the experimental setup. |