Unsupervised Anomaly Detection with Rejection
Authors: Lorenzo Perini, Jesse Davis
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally address the following research questions: Q1. How does REJEX's cost compare to the baselines? Q2. How does varying the cost function affect the results? Q3. How does REJEX's CPU time compare to the baselines? Q4. Do the theoretical results hold in practice? |
| Researcher Affiliation | Academia | Lorenzo Perini, DTAI lab & Leuven.AI, KU Leuven, Belgium (lorenzo.perini@kuleuven.be); Jesse Davis, DTAI lab & Leuven.AI, KU Leuven, Belgium (jesse.davis@kuleuven.be) |
| Pseudocode | No | The paper includes mathematical formulations and proofs (e.g., Theorem 3.1, Theorem 3.5, Theorem 3.6, Theorem 3.8), but it does not feature any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code available at: https://github.com/Lorenzo-Perini/RejEx. |
| Open Datasets | Yes | We carry out our study on 34 publicly available benchmark datasets, widely used in the literature [23]. These datasets can be downloaded from the following link: https://github.com/Minqi824/ADBench/tree/main/datasets/Classical. |
| Dataset Splits | Yes | (1) we split the dataset into training and test sets (80-20) using 5-fold cross-validation; ... This uses (any) unsupervised detector to obtain pseudo labels for the training set. It then sets the rejection threshold as follows: 1) it creates a held-out validation set (20%) |
| Hardware Specification | Yes | All experiments were run on an Intel(R) Xeon(R) Silver 4214 CPU. |
| Software Dependencies | No | The paper mentions software tools like PYOD [66] and Bayesian Optimization [17] but does not specify their version numbers or the version of the programming language (e.g., Python) used. |
| Experiment Setup | Yes | We set our tolerance ε = 2e^{-T} with T = 32. Note that the exponential smooths out the effect of T, which makes setting a different T have little impact. We use a set of 12 unsupervised anomaly detectors implemented in PYOD [66] with default hyperparameters [62] because the unsupervised setting does not allow us to tune them: KNN [3], IFOREST [42], LOF [5], OCSVM [58], AE [8], HBOS [21], LODA [53], COPOD [39], GMM [2], ECOD [40], KDE [36], INNE [4]. We set all the baselines' rejection thresholds via Bayesian Optimization with 50 calls [17]. |
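
The Open Datasets row points to the ADBench "Classical" benchmark. Below is a minimal loading sketch, assuming the ADBench convention of storing each dataset as a NumPy `.npz` archive with keys `X` (features) and `y` (binary anomaly labels); the file name is only an illustrative placeholder, not one verified against the repository.

```python
import numpy as np

# Load one ADBench "Classical" benchmark dataset (illustrative file name).
# ADBench stores each dataset as an .npz archive with a feature matrix 'X'
# and binary ground-truth labels 'y' (1 = anomaly, 0 = normal).
data = np.load("6_cardio.npz", allow_pickle=True)
X, y = data["X"], data["y"]
print(X.shape, y.mean())  # dataset size and contamination rate
```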
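The Dataset Splits row describes an 80-20 train-test split over 5 folds, with a further 20% of each training set held out as a validation set on which the baselines' rejection thresholds are set from pseudo labels. A minimal sketch of that protocol, assuming scikit-learn; the function name and random seed are ours, not the authors'.

```python
from sklearn.model_selection import KFold, train_test_split

def split_folds(X, n_splits=5, val_fraction=0.20, seed=0):
    """Yield (train, val, test) index arrays following the quoted protocol:
    5-fold cross-validation gives an 80-20 train-test split per fold, and a
    further 20% of each training set is held out as a validation set used to
    set the baselines' rejection thresholds on pseudo labels."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in kf.split(X):
        train_idx, val_idx = train_test_split(
            train_idx, test_size=val_fraction, random_state=seed
        )
        yield train_idx, val_idx, test_idx
```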
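The Experiment Setup row lists 12 unsupervised PyOD detectors run with default hyperparameters and a tolerance ε = 2e^{-T} with T = 32. A sketch of how these could be instantiated; the import paths follow recent PyOD releases and are an assumption, since the paper does not pin a PyOD version.

```python
import numpy as np
from pyod.models.knn import KNN
from pyod.models.iforest import IForest
from pyod.models.lof import LOF
from pyod.models.ocsvm import OCSVM
from pyod.models.auto_encoder import AutoEncoder
from pyod.models.hbos import HBOS
from pyod.models.loda import LODA
from pyod.models.copod import COPOD
from pyod.models.gmm import GMM
from pyod.models.ecod import ECOD
from pyod.models.kde import KDE
from pyod.models.inne import INNE

# Tolerance from the paper: eps = 2 * exp(-T) with T = 32.
T = 32
eps = 2 * np.exp(-T)

# The 12 unsupervised detectors, all with PyOD default hyperparameters,
# because the unsupervised setting provides no labels to tune them with.
def make_detectors():
    return {
        "KNN": KNN(), "IFOREST": IForest(), "LOF": LOF(), "OCSVM": OCSVM(),
        "AE": AutoEncoder(), "HBOS": HBOS(), "LODA": LODA(), "COPOD": COPOD(),
        "GMM": GMM(), "ECOD": ECOD(), "KDE": KDE(), "INNE": INNE(),
    }
```

Each PyOD detector exposes `fit(X)` and `decision_function(X)` (higher score = more anomalous), so the same training and scoring loop can be reused across all 12 models.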
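The same row states that the baselines' rejection thresholds are set via Bayesian Optimization with 50 calls [17]. A hedged sketch using scikit-optimize's `gp_minimize`, which may not be the library the authors actually used; the cost function, pseudo labels, and search bounds are placeholders.

```python
from skopt import gp_minimize

def tune_rejection_threshold(scores_val, pseudo_labels, cost_fn, n_calls=50):
    """Pick a rejection threshold on held-out validation scores by minimizing
    a cost estimated from pseudo labels, using Bayesian Optimization with
    50 calls as in the quoted baseline setup. `cost_fn` is a placeholder."""
    lo, hi = float(scores_val.min()), float(scores_val.max())

    def objective(params):
        threshold = params[0]
        return cost_fn(scores_val, pseudo_labels, threshold)

    result = gp_minimize(objective, [(lo, hi)], n_calls=n_calls, random_state=0)
    return result.x[0]
```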