Towards Lightweight, Model-Agnostic and Diversity-Aware Active Anomaly Detection

Authors: Xu Zhang, Yuan Zhao, Ziang Cui, Liqun Li, Shilin He, Qingwei Lin, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conducted extensive experiments on 8 public AD datasets to evaluate the effectiveness of our proposed method. The experimental results show that LMADA can achieve 74% F1-Score improvement on average, outperforming other comparative AAD approaches under the same feedback sample budget. In addition, we also validated that LMADA works well under various unsupervised anomaly detectors.
Researcher Affiliation | Collaboration | Microsoft Research, Peking University, Southeast University, Microsoft Azure, Microsoft 365
Pseudocode | No | The paper describes the methods in prose and equations but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper references source code for comparative methods (Meta-AAD, FIF) but does not provide an explicit statement or link for the open-source code of their proposed LMADA method.
Open Datasets | Yes | We used eight public datasets for the evaluation. Page Blocks, Annthyroid, Cardio, Cover, Mammography, and Shuttle are available in ODDS. KDD99-Http and KDD99-SA are available in the UCI Machine Learning Repository. Page Blocks can also be referred to in ADBench.
Dataset Splits | No | The paper refers to 'train' and 'test' implicitly through its feedback iteration process and mentions using labeled samples to train a 'representation adjuster', but it does not specify a distinct validation split for hyperparameter tuning or model selection.
Hardware Specification | Yes | In our experiments, we set up a Virtual Machine (VM) with 64 Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz processors and 256GB RAM. The operating system is Ubuntu-20.04. In the VM, we had an NVIDIA Tesla M40 GPU with CUDA 11.4 for deep learning model training.
Software Dependencies | Yes | We built LMADA based on PyTorch 1.12.0 (Paszke et al., 2019) and used base unsupervised anomaly detectors implemented in PyOD 1.0.3 (Zhao et al., 2019).
Experiment Setup | Yes | For the sample selector of LMADA, we set the pre-truncation rate α = 10%. We introduce two hyper-parameters λ and γ to adjust the preference of anomaly score and diversity (L_ij = (a_i a_j)^λ (r_i r_j ⟨s_i, s_j⟩)^γ). In the experiments, we set λ = 1 and γ = 1. In the model tuner, we utilized the Adam optimizer (Kingma & Ba, 2014) and set the epoch number to 10, the learning rate to 0.01, and the batch size to 512, for both the proxy model approximation phase and the representation adjuster tuning phase. The size of the proxy model hidden layer is set to 64.