reproducibilityindex.ai

Class Prior Estimation in Active Positive and Unlabeled Learning

Authors: Lorenzo Perini, Vincent Vercruyssen, Jesse Davis

IJCAI 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirically, we show that our approach accurately recovers the true class prior on a benchmark of anomaly detection datasets and that it does so more accurately than existing methods. and 5 Experiments We empirically evaluate the effectiveness of CAPE to recover the true class prior in the context of anomaly detection because it matches our setting: a handful of normal (positive) labels are acquired through an active learning strategy, the remaining examples are unlabeled.
Researcher Affiliation	Academia	Lorenzo Perini, Vincent Vercruyssen and Jesse Davis DTAI Research group, KU Leuven, Belgium lorenzo.perini@kuleuven.be, vincent.vercruyssen@kuleuven.be, jesse.davis@kuleuven.be
Pseudocode	No	The paper describes methods using prose and mathematical equations but does not include structured pseudocode or algorithm blocks.
Open Source Code	Yes	Code: https://github.com/Lorenzo-Perini/Active_PU_Learning
Open Datasets	Yes	Data. The benchmark consists of 9 standard anomaly detection datasets from [Campos et al., 2016]. The datasets are listed in Table 1. They contain more normals than anomalies with normal class priors varying between 0.64 and 0.99. Data: www.dbs.ifi.lmu.de/research/outlier-evaluation
Dataset Splits	Yes	First, the dataset is split into training and test sets using a stratified 5-fold split.
Hardware Specification	No	The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies	No	SSDO with its default parameters is used as the semi-supervised anomaly detector [Vercruyssen et al., 2018]. ... We use ISOLATION FOREST [Liu et al., 2008] as its unsupervised prior. ... We use uncertainty sampling as active learning strategy [Settles, 2012]. We model the user's uncertainty using the kernel density estimate as implemented in SCIKIT-LEARN.
Experiment Setup	Yes	SSDO with its default parameters is used as the semi-supervised anomaly detector [Vercruyssen et al., 2018]. and The parameters of TICE, KM1, and KM2 are set to the values recommended in the original papers. and CAPE has only one hyperparameter: the range of cardinalities m in the outer loop, which is minimally 1 and maximally n (the cardinality of the dataset). In the experiments, we set the range to n · {0.02, 0.04, 0.06, ..., 0.4, 0.5, ..., 0.9}. and The process stops when 150 examples are labeled.
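
The cardinality range and the stratified 5-fold split quoted in the table can be illustrated with a short Python sketch. This is not the authors' code: the function names `cardinality_grid` and `stratified_folds` are hypothetical, the rounding and clamping of m are assumptions (the excerpt does not say how fractional cardinalities are handled), and the fold assignment is a simplified pure-Python stand-in for a library routine such as scikit-learn's `StratifiedKFold`.

```python
import random
from collections import defaultdict


def cardinality_grid(n):
    """Candidate cardinalities m = fraction * n for the quoted fraction grid."""
    # Fractions in hundredths: 0.02..0.40 in steps of 0.02, then 0.5..0.9 in steps of 0.1.
    hundredths = list(range(2, 41, 2)) + list(range(50, 91, 10))
    # Assumption: round half up, then clamp to [1, n] (the stated bounds on m).
    values = {min(n, max(1, (n * h + 50) // 100)) for h in hundredths}
    return sorted(values)


def stratified_folds(labels, k=5, seed=0):
    """Split example indices into k folds that keep class proportions similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's examples round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


if __name__ == "__main__":
    # Hypothetical dataset: 90 normals (label 1) and 10 anomalies (label 0).
    labels = [1] * 90 + [0] * 10
    for test_idx in stratified_folds(labels, k=5):
        train_idx = [i for i in range(len(labels)) if i not in set(test_idx)]
        print(len(train_idx), len(test_idx), cardinality_grid(len(train_idx))[:5])
```

Each fold serves once as the test set while the remaining four form the training set; the grid is recomputed from the training-set size here, which is an assumption about what n refers to.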