Unsupervised Anomaly Detection with Rejection

Authors: Lorenzo Perini, Jesse Davis

NeurIPS 2023

Reproducibility assessment: each entry below gives the variable, the assessed result, and the supporting LLM response.
Research Type: Experimental. We experimentally address the following research questions: Q1. How does REJEX's cost compare to the baselines? Q2. How does varying the cost function affect the results? Q3. How does REJEX's CPU time compare to the baselines? Q4. Do the theoretical results hold in practice?
Researcher Affiliation: Academia. Lorenzo Perini, DTAI lab & Leuven.AI, KU Leuven, Belgium (lorenzo.perini@kuleuven.be); Jesse Davis, DTAI lab & Leuven.AI, KU Leuven, Belgium (jesse.davis@kuleuven.be).
Pseudocode: No. The paper includes mathematical formulations and proofs (e.g., Theorems 3.1, 3.5, 3.6, and 3.8), but it does not feature any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
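Although the paper provides no algorithm block, its decision rule can be sketched compactly. Below is a minimal, unofficial Python reconstruction of REJEX, assuming the ExCeeD-style confidence estimate the method builds on (a binomial tail over a smoothed estimate of a test score's training-set exceedance) and the tolerance ε = 2e^(-T) quoted under Experiment Setup; the function names and details are illustrative, not the authors' code.

```python
import numpy as np
from scipy.stats import binom

def exceed_confidence(train_scores, test_scores, contamination):
    """ExCeeD-style confidence in each example-wise prediction (sketch).

    For a test score s, estimate P(score <= s) with a Laplace-smoothed
    training frequency, then take the binomial tail probability that s
    exceeds the (1 - contamination) training quantile.
    """
    train_scores = np.asarray(train_scores)
    n = len(train_scores)
    m = int(np.ceil(n * (1.0 - contamination)))        # rank defining the anomaly region
    threshold = np.quantile(train_scores, 1.0 - contamination)
    preds, confs = [], []
    for s in np.asarray(test_scores):
        t = np.sum(train_scores <= s)                  # training scores not above s
        p = (t + 1.0) / (n + 2.0)                      # smoothed CDF estimate
        tail = binom.sf(m - 1, n, p)                   # P(Binom(n, p) >= m)
        is_anom = s > threshold
        preds.append(int(is_anom))
        confs.append(tail if is_anom else 1.0 - tail)  # confidence in the prediction
    return np.array(preds), np.array(confs)

def rejex_predict(train_scores, test_scores, contamination, T=32):
    """Return 1 (anomaly), 0 (normal), or -1 (reject)."""
    eps = 2.0 * np.exp(-T)                             # tolerance; the paper uses T = 32
    preds, confs = exceed_confidence(train_scores, test_scores, contamination)
    preds[confs <= 1.0 - eps] = -1                     # reject low-confidence predictions
    return preds
```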
Open Source Code: Yes. Code available at: https://github.com/Lorenzo-Perini/RejEx.
Open Datasets: Yes. We carry out our study on 34 publicly available benchmark datasets, widely used in the literature [23]. These datasets can be downloaded at the following link: https://github.com/Minqi824/ADBench/tree/main/datasets/Classical.
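For illustration, the ADBench classical datasets are distributed as .npz archives containing a feature matrix X and 0/1 anomaly labels y; the file name below is one example from that directory.

```python
import numpy as np

# Example file from ADBench's datasets/Classical directory; the 34
# benchmark datasets used in the paper follow the same layout.
data = np.load("20_letter.npz", allow_pickle=True)
X, y = data["X"], data["y"]   # features and 0/1 anomaly labels
print(X.shape, y.mean())      # dataset size and contamination rate
```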
Dataset Splits: Yes. (1) We split the dataset into training and test sets (80-20) using 5-fold cross-validation; ... This uses (any) unsupervised detector to obtain pseudo-labels for the training set. It then sets the rejection threshold as follows: 1) it creates a held-out validation set (20%)
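A sketch of this protocol with scikit-learn (an assumed tool; the paper does not name its splitting utilities): 5-fold cross-validation yields the 80-20 train/test splits, and baselines that tune a rejection threshold hold out a further 20% of the training data as a validation set.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

X = np.random.randn(1000, 8)     # stand-in for one benchmark dataset
kf = KFold(n_splits=5, shuffle=True, random_state=0)

for train_idx, test_idx in kf.split(X):
    X_train, X_test = X[train_idx], X[test_idx]   # 80-20 split per fold
    # Baselines that tune a rejection threshold additionally hold out
    # 20% of the training data as a validation set:
    X_fit, X_val = train_test_split(X_train, test_size=0.2, random_state=0)
```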
Hardware Specification: Yes. All experiments were run on an Intel(R) Xeon(R) Silver 4214 CPU.
Software Dependencies: No. The paper mentions software tools like PYOD [66] and Bayesian Optimization [17] but does not specify their version numbers or the version of the programming language (e.g., Python) used.
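Because no versions are pinned, a reproduction has to record its own environment. A minimal snippet using only the standard library; the package names are assumptions based on the tools the paper mentions, and version() raises PackageNotFoundError for anything not installed.

```python
import sys
from importlib.metadata import version

for pkg in ("pyod", "numpy", "scipy"):
    print(pkg, version(pkg))             # record the locally installed versions
print("python", sys.version.split()[0])  # interpreter version
```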
Experiment Setup: Yes. We set our tolerance ε = 2e^(-T) with T = 32. Note that the exponential smooths out the effect of T ≥ 4, which makes setting a different T have little impact. We use a set of 12 unsupervised anomaly detectors implemented in PYOD [66] with default hyperparameters [62], because the unsupervised setting does not allow us to tune them: KNN [3], IFOREST [42], LOF [5], OCSVM [58], AE [8], HBOS [21], LODA [53], COPOD [39], GMM [2], ECOD [40], KDE [36], INNE [4]. We set all the baselines' rejection thresholds via Bayesian Optimization with 50 calls [17].
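The detector suite maps directly onto PyOD model classes. The sketch below instantiates all 12 detectors with default hyperparameters on stand-in data; the import paths follow current PyOD naming and may shift across releases, which the paper does not pin.

```python
import numpy as np
from pyod.models.knn import KNN
from pyod.models.iforest import IForest
from pyod.models.lof import LOF
from pyod.models.ocsvm import OCSVM
from pyod.models.auto_encoder import AutoEncoder
from pyod.models.hbos import HBOS
from pyod.models.loda import LODA
from pyod.models.copod import COPOD
from pyod.models.gmm import GMM
from pyod.models.ecod import ECOD
from pyod.models.kde import KDE
from pyod.models.inne import INNE

EPSILON = 2.0 * np.exp(-32)   # tolerance epsilon = 2e^(-T) with T = 32

X_train = np.random.randn(500, 8)   # stand-in data; replace with a benchmark fold
X_test = np.random.randn(100, 8)

# All 12 unsupervised detectors with default hyperparameters, since the
# unsupervised setting provides no labels to tune them on.
detectors = [KNN(), IForest(), LOF(), OCSVM(), AutoEncoder(), HBOS(),
             LODA(), COPOD(), GMM(), ECOD(), KDE(), INNE()]

for det in detectors:
    det.fit(X_train)                        # unsupervised fit on the training fold
    scores = det.decision_function(X_test)  # anomaly scores for the test fold
```

The baselines' threshold search (Bayesian Optimization with 50 calls) is omitted here because the paper does not identify the specific optimization package behind reference [17].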