MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams
Authors: Siddharth Bhatia, Bryan Hooi, Minji Yoon, Kijung Shin, Christos Faloutsos (pp. 3242-3249)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results show that MIDAS outperforms baseline approaches by 46%-52% accuracy (in terms of AUC), and processes the data 108-505 times faster than baseline approaches. |
| Researcher Affiliation | Academia | ¹National University of Singapore, ²Carnegie Mellon University, ³KAIST; {siddharth, bhooi}@comp.nus.edu.sg, {minjiy, christos}@cs.cmu.edu, kijungs@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1: MIDAS: Streaming Anomaly Scoring; Algorithm 2: MIDAS-R: Incorporating Relations |
| Open Source Code | Yes | Reproducibility: Our code and datasets are publicly available at https://github.com/bhatiasiddharth/MIDAS. |
| Open Datasets | Yes | Datasets: DARPA (Lippmann et al. 1999) has 4.5M IP-IP communications... Twitter Security (Rayana and Akoglu 2015; 2016)... Twitter World Cup (Rayana and Akoglu 2015; 2016)... |
| Dataset Splits | No | No explicit training, validation, and test dataset splits (e.g., percentages or absolute counts) are mentioned. The paper describes using datasets with ground truth for evaluation but not typical model training splits. |
| Hardware Specification | Yes | All experiments are carried out on a 2.7GHz Intel Core i5 processor, 16GB RAM, running OS X 10.14.6. |
| Software Dependencies | No | The paper mentions 'We implement MIDAS and MIDAS-R in C++' but does not specify the version of the compiler or any libraries used. It also mentions using 'an open-sourced implementation of SEDANSPOT' without providing its version. |
| Experiment Setup | Yes | We use 2 hash functions for the CMS data structures, and we set the number of CMS buckets to 2719 to result in an approximation error of ν = 0.001. For MIDAS-R, we set the temporal decay factor α as 0.5. We used an open-sourced implementation of SEDANSPOT, provided by the authors, following parameter settings as suggested in the original paper (sample size 500). |
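
The reported bucket count agrees with the standard count-min sketch (CMS) sizing rule, width = ⌈e / error⌉, which gives 2719 for an error of 0.001. The sketch below checks that arithmetic and builds a small CMS with 2 hash rows. It is only an illustration under stated assumptions: the summary does not say how the authors derived 2719, and the names `cms_width` and `CountMinSketch`, the salted use of Python's built-in `hash`, and the sample edge are all hypothetical rather than taken from the MIDAS code.

```python
import math
import random


def cms_width(error: float) -> int:
    """Buckets per row under the standard CMS rule width = ceil(e / error)."""
    return math.ceil(math.e / error)


class CountMinSketch:
    """Minimal count-min sketch; the hashing scheme is an illustrative assumption."""

    def __init__(self, rows: int, width: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.width = width
        # One random salt per row stands in for an independent hash function.
        self.salts = [rng.getrandbits(64) for _ in range(rows)]
        self.table = [[0] * width for _ in range(rows)]

    def _index(self, salt: int, key) -> int:
        # Salting the key gives each row a different bucket assignment.
        return hash((salt, key)) % self.width

    def add(self, key, count: int = 1) -> None:
        for row, salt in zip(self.table, self.salts):
            row[self._index(salt, key)] += count

    def estimate(self, key) -> int:
        # Collisions can only inflate a counter, so the minimum over rows
        # is the tightest available estimate and never undercounts.
        return min(
            row[self._index(salt, key)]
            for row, salt in zip(self.table, self.salts)
        )


if __name__ == "__main__":
    width = cms_width(0.001)
    sketch = CountMinSketch(rows=2, width=width)
    edge = ("10.0.0.1", "10.0.0.2")  # hypothetical source/destination pair
    for _ in range(3):
        sketch.add(edge)
    print(width, sketch.estimate(edge))
```

Taking the minimum over rows is what keeps the estimate an upper bound on the true count: each row can overcount through collisions but never undercount.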