Class Prior Estimation in Active Positive and Unlabeled Learning

Authors: Lorenzo Perini, Vincent Vercruyssen, Jesse Davis

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, we show that our approach accurately recovers the true class prior on a benchmark of anomaly detection datasets and that it does so more accurately than existing methods." and "5 Experiments. We empirically evaluate the effectiveness of CAPE to recover the true class prior in the context of anomaly detection because it matches our setting: a handful of normal (positive) labels are acquired through an active learning strategy, the remaining examples are unlabeled."
Researcher Affiliation | Academia | "Lorenzo Perini, Vincent Vercruyssen and Jesse Davis, DTAI Research group, KU Leuven, Belgium. lorenzo.perini@kuleuven.be, vincent.vercruyssen@kuleuven.be, jesse.davis@kuleuven.be"
Pseudocode | No | The paper describes methods using prose and mathematical equations but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Footnote 5: "Code: https://github.com/Lorenzo-Perini/Active PU Learning"
Open Datasets | Yes | "Data. The benchmark consists of 9 standard anomaly detection datasets from [Campos et al., 2016]. The datasets are listed in Table 1. They contain more normals than anomalies with normal class priors varying between 0.64 and 0.99." Footnote 6: "Data: www.dbs.ifi.lmu.de/research/outlier-evaluation"
Dataset Splits | Yes | "First, the dataset is split into training and test sets using a stratified 5-fold split."
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | "SSDO with its default parameters is used as the semi-supervised anomaly detector [Vercruyssen et al., 2018]. ... We use ISOLATION FOREST [Liu et al., 2008] as its unsupervised prior. ... We use uncertainty sampling as active learning strategy [Settles, 2012]. We model the user's uncertainty using the kernel density estimate as implemented in SCIKIT-LEARN."
Experiment Setup | Yes | "SSDO with its default parameters is used as the semi-supervised anomaly detector [Vercruyssen et al., 2018]." and "The parameters of TICE, KM1, and KM2 are set to the values recommended in the original papers." and "CAPE has only one hyperparameter: the range of cardinalities m in the outer loop, which is minimally 1 and maximally n (the cardinality of the dataset). In the experiments, we set the range to n × {0.02, 0.04, 0.06, ..., 0.4, 0.5, ..., 0.9}." and "The process stops when 150 examples are labeled."
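
To make the cardinality range in the Experiment Setup row concrete, the following is a minimal sketch of how such a grid could be enumerated for a dataset of n examples. It is not taken from the CAPE repository. The helper name cardinality_grid is hypothetical, the range is read here as n multiplied by each listed fraction, and the 0.1 step between 0.5 and 0.9 is an assumption, since the source elides the intermediate values.

```python
# Hypothetical helper for illustration; not part of the authors' code.
def cardinality_grid(n):
    """Return candidate cardinalities m for a dataset of n examples."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Fractions expressed in hundredths: 0.02..0.40 in steps of 0.02,
    # then 0.50..0.90. The 0.1 step in the second stretch is an assumption,
    # because the source only writes "0.5, ..., 0.9".
    hundredths = list(range(2, 41, 2)) + list(range(50, 91, 10))
    sizes = set()
    for h in hundredths:
        m = (n * h + 50) // 100       # n * h / 100 rounded half up, integers only
        sizes.add(min(n, max(1, m)))  # the source bounds m by 1 and n
    return sorted(sizes)              # duplicates collapse on small datasets


grid = cardinality_grid(1000)
print(len(grid), grid[0], grid[-1])
```

Integer arithmetic avoids floating-point drift in fractions such as 0.06, and clipping to [1, n] mirrors the stated bounds on m. On very small datasets several fractions map to the same cardinality, so the grid may contain fewer than 25 values.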