Linear-Time Outlier Detection via Sensitivity

Authors: Mario Lucic, Olivier Bachem, Andreas Krause

IJCAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | In an extensive experimental evaluation, we demonstrate the effectiveness and establish the statistical significance of the proposed approach. In particular, it outperforms the most popular distance-based approaches while being several orders of magnitude faster. |
| Researcher Affiliation | Academia | Mario Lucic, ETH Zurich, lucic@inf.ethz.ch; Olivier Bachem, ETH Zurich, olivier.bachem@inf.ethz.ch; Andreas Krause, ETH Zurich, krausea@ethz.ch |
| Pseudocode | Yes | Algorithm 1 INFLUENCE and Algorithm 2 DISTRIBUTED INFLUENCE are provided (a hedged sketch of the sensitivity idea follows the table). |
| Open Source Code | No | The paper mentions implementation details in a footnote: 'The algorithms are implemented in Python 2.7 using NumPy and SciPy libraries and Cython for performance critical operations.' However, it does not state that the code is publicly available or provide a link. |
| Open Datasets | Yes | The experimental evaluation is applied on a variety of real-world data sets available on UCI [Asuncion and Newman, 2007] as well as on synthetic data sets. ... The relevant information is summarized in Table 2. |
| Dataset Splits | No | The paper reports evaluation using AUPRC but does not explicitly provide train/validation/test splits (e.g., percentages or specific files) for the datasets used in the experiments. |
| Hardware Specification | Yes | The experiments were run on an Intel Xeon 3.3GHz machine with 36 cores and 1.5TB of RAM. |
| Software Dependencies | Yes | The algorithms are implemented in Python 2.7 using NumPy and SciPy libraries and Cython for performance critical operations. |
| Experiment Setup | Yes | Parameters. We follow the parameter settings commonly used or suggested by the authors. For KNN and LOF we set k = 10 and k = 5, respectively [Bay and Schwabacher, 2003; Bhaduri et al., 2011; Orair et al., 2010]. For both ONE-TIME SAMPLING and ITERATIVE SAMPLING we set s = 20 and additionally k = 5 for ITERATIVE SAMPLING [Sugiyama and Borgwardt, 2013]. As our proposal, we apply Algorithm 1 with model averaging and k ∈ ⋃_{i=1}^{15} {500/i}. For each algorithm with a random selection process we average 30 runs and we present the mean and variance of the AUPRC score. (An illustrative evaluation sketch follows the table.) |
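
For context on the Pseudocode row: Algorithm 1 (INFLUENCE) scores points by their sensitivity with respect to a k-means-style clustering. The sketch below is a hedged reconstruction of that idea, not the paper's exact procedure: the seeding routine, the constant `alpha`, and the precise form of the sensitivity bound are assumptions made for illustration, so consult the paper's Algorithm 1 for the actual definitions.

```python
import numpy as np

def d2_seeding(X, k, rng):
    """Cheap k-means++-style (D^2-sampling) solution used as the reference clustering."""
    n = X.shape[0]
    centers = [X[rng.integers(n)]]
    d2 = ((X - centers[0]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        idx = rng.choice(n, p=d2 / d2.sum())      # sample proportional to squared distance
        centers.append(X[idx])
        d2 = np.minimum(d2, ((X - X[idx]) ** 2).sum(axis=1))
    return np.asarray(centers)

def influence_scores(X, k=50, alpha=16.0, seed=0):
    """Sensitivity-style outlier scores w.r.t. a rough k-means solution.

    Illustrative reconstruction only: `alpha` and the exact form of the
    bound are assumptions, not the constants of the paper's Algorithm 1.
    """
    rng = np.random.default_rng(seed)
    B = d2_seeding(X, k, rng)
    d = ((X[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)   # (n, k) squared distances
    assign = d.argmin(axis=1)                                # closest center per point
    cost = d[np.arange(len(X)), assign]                      # per-point quantization cost
    c_phi = cost.mean()                                      # average cost over the data set
    cluster_cost = np.bincount(assign, weights=cost, minlength=k)
    cluster_size = np.bincount(assign, minlength=k)
    # Points that are expensive relative to the average, or that live in small
    # clusters, receive a large score and are flagged as outliers.
    return (alpha * cost / c_phi
            + 2.0 * alpha * cluster_cost[assign] / (cluster_size[assign] * c_phi)
            + 4.0 * len(X) / cluster_size[assign])
```

The design intuition, which the paper makes precise, is that a point's sensitivity is large when it is far from every center of the rough solution or sits in a sparsely populated cluster, which is exactly the behaviour one wants from an outlier score.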
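
The Experiment Setup row also translates into a small evaluation harness. The sketch below is an assumption-laden illustration using scikit-learn, which the paper does not mention (its implementation relies on NumPy, SciPy and Cython); the function names are hypothetical. It wires up the quoted baseline parameters (k = 10 for KNN, k = 5 for LOF) and the mean/variance AUPRC reporting over repeated runs for randomized methods.

```python
import numpy as np
from sklearn.metrics import average_precision_score
from sklearn.neighbors import LocalOutlierFactor, NearestNeighbors

def knn_score(X, k=10):
    """KNN baseline (k = 10): distance to the k-th nearest neighbour."""
    dist, _ = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    return dist[:, -1]          # column 0 is the point itself

def lof_score(X, k=5):
    """LOF baseline (k = 5); larger values indicate stronger outliers."""
    lof = LocalOutlierFactor(n_neighbors=k).fit(X)
    return -lof.negative_outlier_factor_

def mean_var_auprc(score_runs, y_true):
    """Mean and variance of AUPRC over repeated runs of a (randomized) scorer."""
    vals = [average_precision_score(y_true, s) for s in score_runs]
    return float(np.mean(vals)), float(np.var(vals))

# Deterministic baselines (KNN, LOF) need a single run; methods with a random
# selection step (e.g. the sampling baselines with s = 20, or the proposal)
# would be scored 30 times and summarized with mean_var_auprc, as in the setup.
```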