Automatic Unsupervised Outlier Model Selection

Authors: Yue Zhao, Ryan Rossi, Leman Akoglu

Venue: NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that selecting a model by METAOD significantly outperforms no model selection (e.g., always using the same popular model or the ensemble of many) as well as other meta-learning techniques that we tailored for UOMS.
Researcher Affiliation | Collaboration | Yue Zhao (Carnegie Mellon University, zhaoy@cmu.edu); Ryan A. Rossi (Adobe Research, ryrossi@adobe.com); Leman Akoglu (Carnegie Mellon University, lakoglu@andrew.cmu.edu)
Pseudocode | Yes | We also provide the detailed steps of METAOD in pseudo-code, for both meta-training (offline) and model selection (online), in Appendix D, Algo. 1. (See the hedged selection sketch after this table.)
Open Source Code | Yes | We open-source METAOD and our meta-learning database for practical use and to foster further research on the UOMS problem. Code available at: https://github.com/yzhao062/UOMS
Open Datasets | Yes | 1. Proof-of-Concept (POC) testbed contains 100 datasets that form clusters of similar datasets, where 5 different detection tasks ("siblings") are created from each one of 20 "mothersets". 2. Stress Testing (ST) testbed consists of 62 independent datasets from 3 different public-domain OD dataset repositories, which exhibit relatively lower similarity to one another. We use the benchmark datasets by Emmott et al. [11], who created "childsets" from 20 independent mothersets by sampling. Benchmark datasets: https://ir.library.oregonstate.edu/concern/datasets/47429f155
Dataset Splits | Yes | We split them into 5 folds for cross-validation, each test fold containing 20 independent childsets without siblings. For evaluation on ST, we use leave-one-out cross-validation, each time using 61 datasets as meta-train. (See the split sketch after this table.)
Hardware Specification | Yes | All models are built using the PyOD library [61] on an Intel i7-9700 @ 3.00 GHz, 64 GB RAM, 8-core workstation.
Software Dependencies | No | The paper mentions the "PyOD library [61]" but does not provide specific version numbers for it or any other ancillary software dependencies.
Experiment Setup | Yes | We pair 8 SOTA OD algorithms and their corresponding hyperparameters to compose a model set M with 302 unique models. (See Appendix A, Table 2 for the complete list, and the model-grid sketch after this table.)
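
The sketch below illustrates the two evaluation protocols quoted in the Dataset Splits row: 5-fold cross-validation on the POC testbed and leave-one-out on the ST testbed. The dataset indices and the fold assignment are illustrative assumptions, not the paper's released split files; the POC fold assignment follows one plausible reading in which each test fold holds one sibling per motherset.

```python
# Illustrative sketch of the POC and ST evaluation splits (assumptions noted above).
import numpy as np
from sklearn.model_selection import LeaveOneOut

# POC testbed: 100 childsets = 20 mothersets x 5 siblings.
# Assumed fold assignment: sibling s of every motherset goes to test fold s,
# so each test fold holds 20 mutually independent childsets (no siblings together),
# while their siblings remain available in the meta-train portion.
n_mothersets, n_siblings = 20, 5
sibling_id = np.tile(np.arange(n_siblings), n_mothersets)  # sibling index per childset

poc_folds = []
for s in range(n_siblings):
    test_idx = np.flatnonzero(sibling_id == s)   # 20 childsets, one per motherset
    train_idx = np.flatnonzero(sibling_id != s)  # remaining 80 childsets as meta-train
    poc_folds.append((train_idx, test_idx))

# ST testbed: 62 independent datasets, leave-one-out cross-validation.
st_indices = np.arange(62)
st_folds = list(LeaveOneOut().split(st_indices))
assert all(len(tr) == 61 and len(te) == 1 for tr, te in st_folds)
```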
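The Experiment Setup row describes crossing 8 OD algorithms with hyperparameter grids to obtain 302 candidate models. The sketch below shows how such a model set can be enumerated with PyOD; the algorithm families and grid values here are illustrative assumptions (a small toy grid), not the paper's exact 302-model configuration, which is listed in its Appendix A, Table 2.

```python
# Illustrative enumeration of a candidate model set from hyperparameter grids.
from itertools import product

from pyod.models.iforest import IForest
from pyod.models.knn import KNN
from pyod.models.lof import LOF
from pyod.models.ocsvm import OCSVM

# detector class -> hyperparameter grid (toy values for illustration only)
grids = {
    LOF:     {"n_neighbors": [5, 10, 20, 50]},
    KNN:     {"n_neighbors": [5, 10, 20, 50], "method": ["largest", "mean"]},
    IForest: {"n_estimators": [50, 100, 200], "max_features": [0.5, 1.0]},
    OCSVM:   {"nu": [0.1, 0.5, 0.9], "kernel": ["rbf", "sigmoid"]},
}

def enumerate_models(grids):
    """Yield one detector instance per hyperparameter combination."""
    for cls, grid in grids.items():
        names = list(grid)
        for values in product(*(grid[n] for n in names)):
            yield cls(**dict(zip(names, values)))

model_set = list(enumerate_models(grids))
print(f"{len(model_set)} candidate models")  # 4 + 8 + 6 + 6 = 24 in this toy grid
```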
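Finally, a heavily hedged sketch of the online model-selection step that the Pseudocode row refers to (Appendix D, Algo. 1 in the paper). METAOD's actual meta-feature extractor and performance predictor are not reproduced here; extract_meta_features() and the predictor object below are hypothetical placeholders standing in for components learned during offline meta-training.

```python
# Hypothetical placeholders only; not METAOD's actual components.
import numpy as np

def extract_meta_features(X):
    """Hypothetical stand-in: summarize a dataset with a fixed-length vector."""
    return np.array([
        X.shape[0], X.shape[1], X.mean(), X.std(),
        np.median(np.abs(X - np.median(X, axis=0))),
    ])

def select_model(X_new, model_set, predictor):
    """Pick the candidate whose predicted detection performance is highest.

    `predictor` is assumed to have been fit offline on meta-train datasets and to
    expose a predict(meta_features) -> per-model score vector interface.
    """
    meta_features = extract_meta_features(X_new)
    predicted_perf = predictor.predict(meta_features)  # one score per candidate model
    best_idx = int(np.argmax(predicted_perf))
    return model_set[best_idx]
```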