Request-and-Reverify: Hierarchical Hypothesis Testing for Concept Drift Detection with Expensive Labels

Authors: Shujian Yu, Xiaoyang Wang, José C. Príncipe

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Two sets of experiments are performed to evaluate the performance of HHT-CU and HHT-AG. First, quantitative metrics and plots are presented to demonstrate HHT-CU and HHT-AG's effectiveness and superiority over state-of-the-art approaches on benchmark synthetic data. Then, we validate, via three real-world applications, the effectiveness of the proposed HHT-CU and HHT-AG on streaming data classification and the accuracy of its detected concept drift points.
Researcher Affiliation | Collaboration | (1) Nokia Bell Labs, Murray Hill, NJ, USA; (2) Dept. of Electrical and Computer Engineering, University of Florida, Gainesville, FL, USA
Pseudocode | Yes | Algorithm 1: HHT with Classification Uncertainty (HHT-CU) and Algorithm 2: HHT with Attribute-wise Goodness-of-fit (HHT-AG)
Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the described methodology.
Open Datasets | Yes | Eight datasets are selected from [Souza et al., 2015; Dyer et al., 2014], namely 2CDT, 2CHT, UG-2C-2D, MG-2C-2D, 4CR, 4CRE-V1, 4CE1CF, 5CVT. Three widely used real-world datasets are selected, namely USENET1 [Katakis et al., 2008], Keystroke [Souza et al., 2015] and Posture [Kaluža et al., 2010].
Dataset Splits | No | The paper describes a streaming data setup with a sliding window approach and initial classifier training, but does not specify explicit training/validation/test dataset splits with percentages or sample counts.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions using 'soft margin SVM as the baseline classifier' but does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | We use the parameters recommended in the papers for each competing method. The detailed values on significance levels or thresholds (if there exist) are shown in Table 1.