Class Prior Estimation in Active Positive and Unlabeled Learning
Authors: Lorenzo Perini, Vincent Vercruyssen, Jesse Davis
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show that our approach accurately recovers the true class prior on a benchmark of anomaly detection datasets and that it does so more accurately than existing methods. and (Section 5, Experiments) We empirically evaluate the effectiveness of CAPE to recover the true class prior in the context of anomaly detection because it matches our setting: a handful of normal (positive) labels are acquired through an active learning strategy, the remaining examples are unlabeled. |
| Researcher Affiliation | Academia | Lorenzo Perini, Vincent Vercruyssen and Jesse Davis, DTAI Research group, KU Leuven, Belgium. lorenzo.perini@kuleuven.be, vincent.vercruyssen@kuleuven.be, jesse.davis@kuleuven.be |
| Pseudocode | No | The paper describes methods using prose and mathematical equations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/Lorenzo-Perini/Active_PU_Learning |
| Open Datasets | Yes | Data. The benchmark consists of 9 standard anomaly detection datasets from [Campos et al., 2016]. The datasets are listed in Table 1. They contain more normals than anomalies, with normal class priors varying between 0.64 and 0.99. Data: www.dbs.ifi.lmu.de/research/outlier-evaluation |
| Dataset Splits | Yes | First, the dataset is split into training and test sets using a stratified 5-fold split. |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | SSDO with its default parameters is used as the semi-supervised anomaly detector [Vercruyssen et al., 2018]. ... We use ISOLATION FOREST [Liu et al., 2008] as its unsupervised prior. ... We use uncertainty sampling as active learning strategy [Settles, 2012]. We model the user's uncertainty using the kernel density estimate as implemented in SCIKIT-LEARN. |
| Experiment Setup | Yes | SSDO with its default parameters is used as the semi-supervised anomaly detector [Vercruyssen et al., 2018]. and The parameters of TICE, KM1, and KM2 are set to the values recommended in the original papers. and CAPE has only one hyperparameter: the range of cardinalities m in the outer loop, which is minimally 1 and maximally n (the cardinality of the dataset). In the experiments, we set the range to n · {0.02, 0.04, 0.06, ..., 0.4, 0.5, ..., 0.9}. and The process stops when 150 examples are labeled. (See the sketches below the table.) |
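
The stratified 5-fold split and the scikit-learn components cited in the Dataset Splits and Software Dependencies rows can be outlined concretely. The sketch below is a minimal reconstruction under those quotes only: the function name `evaluation_folds` is illustrative, and SSDO and the uncertainty-sampling active learner used in the paper are not part of scikit-learn, so they are omitted here.

```python
from sklearn.model_selection import StratifiedKFold
from sklearn.ensemble import IsolationForest

def evaluation_folds(X, y, random_state=0):
    """Yield stratified 5-fold train/test indices plus unsupervised anomaly
    scores from ISOLATION FOREST on the training portion (the paper uses
    ISOLATION FOREST as SSDO's unsupervised prior)."""
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=random_state)
    for train_idx, test_idx in skf.split(X, y):
        iforest = IsolationForest(random_state=random_state).fit(X[train_idx])
        # score_samples is higher for more normal points; negate it so that
        # higher values mean more anomalous.
        anomaly_scores = -iforest.score_samples(X[train_idx])
        yield train_idx, test_idx, anomaly_scores
```

Each fold's `train_idx` would then feed the active learning loop, which in the quoted setup acquires labels until the 150-label budget is reached.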
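Similarly, the candidate range for CAPE's single hyperparameter quoted in the Experiment Setup row, n · {0.02, 0.04, ..., 0.4, 0.5, ..., 0.9}, is easy to materialise. In the sketch below, the fractions are taken from the quote, while rounding and clipping to the valid interval [1, n] are assumptions; `candidate_cardinalities` is an illustrative name, not the authors'.

```python
import numpy as np

def candidate_cardinalities(n):
    """Candidate cardinalities m = n * {0.02, 0.04, ..., 0.4, 0.5, ..., 0.9},
    rounded to integers and clipped to [1, n] (rounding and clipping are
    assumptions, not stated in the quote)."""
    fractions = np.concatenate([np.arange(1, 21) * 0.02,  # 0.02, 0.04, ..., 0.40
                                np.arange(5, 10) * 0.1])  # 0.5, 0.6, ..., 0.9
    return np.unique(np.clip(np.rint(fractions * n).astype(int), 1, n))

# Example: a dataset with 500 examples yields m = 10, 20, ..., 200, 250, ..., 450.
print(candidate_cardinalities(500))
```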