Adaptive Ensemble Active Learning for Drifting Data Stream Mining

Authors: Bartosz Krawczyk, Alberto Cano

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This experimental study was designed to answer four research questions that will provide an insight into the proposed EALMAB approach.
Researcher Affiliation | Academia | Department of Computer Science, Virginia Commonwealth University, Richmond VA, USA; {bkrawczyk,acano}@vcu.edu
Pseudocode | No | No section or figure is explicitly labeled "Pseudocode" or "Algorithm".
Open Source Code | No | No explicit statement or link indicating the release of source code for the described methodology.
Open Datasets | Yes | For the purpose of evaluating our proposed algorithm, we generated 10 diverse and large-scale data stream benchmarks using the MOA environment [Bifet et al., 2010a], as well as two popular real-world data streams. By using data stream generators, we were able to fully control the nature and occurrence of concept drifts, which in turn leads to a more explainable experimental study. Details of the used data streams are given in Table 1. (A minimal stream-generation sketch follows the table.)
Dataset Splits | Yes | Validation set. The ensemble learning algorithm used must be capable of evaluating the generalization metric (see Eq. 16) for each base classifier on instances unseen by that classifier. (...) For evaluating the generalization capabilities of base classifiers in the ensemble (see Eq. 16), we use a window of the ωV = 50 most recently labeled instances. (See the validation-window sketch after the table.)
Hardware Specification | No | No specific hardware details (CPU/GPU models, memory, etc.) are provided for the experimental setup.
Software Dependencies | No | The paper mentions software such as the MOA environment and specific algorithms (Leveraging Bagging, Online Bagging, Accuracy Updated Ensemble, Hoeffding Trees, Naive Bayes) but does not provide version numbers for any of them.
Experiment Setup | Yes | We restrict the size of Leveraging Bagging to 10 base classifiers (as suggested by its authors) and use Hoeffding Trees as base learners. (...) AL budgets. As we want our experimental study to reflect a real-world scenario, we investigate small to medium budgets, from a highly restrictive 1% up to 30%, so B ∈ {0.01, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3}. (...) We use a window size ω = 1000 for calculating the budgets, prequential metrics, and training new classifiers for ensembles. For evaluating the generalization capabilities of base classifiers in the ensemble (see Eq. 16), we use a window of the ωV = 50 most recently labeled instances. (See the budget-loop sketch after the table.)
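
Illustrating the Open Datasets row: the paper generates its synthetic streams with MOA so that drift timing is fully controlled. The sketch below is not MOA; it is a minimal, self-contained Python stand-in for the same idea, a stream whose labeling concept changes abruptly at a known point. The hyperplane concept and all parameter names are illustrative assumptions, not the paper's generators.

```python
import numpy as np

def drifting_hyperplane_stream(n_instances=100_000, n_features=10,
                               drift_point=50_000, seed=42):
    """Yield (x, y) pairs from a linear concept that changes abruptly.

    Before `drift_point`, labels follow one random hyperplane; afterwards
    a different one, giving a single fully controlled abrupt concept drift.
    """
    rng = np.random.default_rng(seed)
    w_old = rng.normal(size=n_features)   # concept before the drift
    w_new = rng.normal(size=n_features)   # concept after the drift
    for t in range(n_instances):
        x = rng.uniform(-1.0, 1.0, size=n_features)
        w = w_old if t < drift_point else w_new
        y = int(x @ w > 0.0)
        yield x, y
```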
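
Illustrating the Dataset Splits row: the quoted text implies a sliding validation set of the ωV = 50 most recently labeled instances, on which each base classifier is scored only over instances it was not trained on. A minimal sketch of that bookkeeping follows; the actual metric of Eq. 16 is not reproduced on this page, so plain accuracy is used as a placeholder, and the single-instance `predict` interface is an assumption.

```python
from collections import deque

class ValidationWindow:
    """Sliding window of the w_V most recently labeled instances.

    Each entry remembers which base classifiers were trained on it, so a
    classifier's generalization can be estimated only on instances unseen
    by it (accuracy here stands in for the paper's Eq. 16 metric).
    """
    def __init__(self, size=50):
        self.window = deque(maxlen=size)

    def add(self, x, y, trained_on_ids):
        # trained_on_ids: indices of base classifiers that will be
        # updated with this labeled instance
        self.window.append((x, y, set(trained_on_ids)))

    def generalization(self, clf_id, classifier):
        # Score clf_id only on window instances it was never trained on.
        unseen = [(x, y) for x, y, seen in self.window if clf_id not in seen]
        if not unseen:
            return 0.0
        hits = sum(classifier.predict(x) == y for x, y in unseen)
        return hits / len(unseen)
```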
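
Illustrating the Experiment Setup row: a hedged sketch of a prequential (test-then-train) loop in which labeling spend is tracked over the last ω = 1000 instances and capped at budget B. The `model` interface (`predict`, `uncertainty`, `partial_fit`) and the uncertainty threshold are assumptions; the paper's actual instance-selection strategy is more involved.

```python
from collections import deque

def prequential_active_learning(stream, model, budget=0.1, window=1000):
    """Test-then-train loop with a labeling budget tracked over a window.

    Each instance is first used for prequential evaluation; a label is
    then requested only while the fraction of labeled instances among the
    last `window` arrivals stays below `budget`.
    """
    recent_correct = deque(maxlen=window)   # prequential accuracy window
    recent_labeled = deque(maxlen=window)   # budget accounting window
    for x, y in stream:
        y_hat = model.predict(x)
        recent_correct.append(int(y_hat == y))
        spent = sum(recent_labeled) / max(len(recent_labeled), 1)
        query = spent < budget and model.uncertainty(x) > 0.5
        recent_labeled.append(int(query))
        if query:
            model.partial_fit(x, y)   # label is paid for and used
        yield sum(recent_correct) / len(recent_correct)
```

Feeding the drifting stream from the first sketch into this loop reproduces the shape of the evaluation protocol: accuracy is measured on each instance before its label is (possibly) purchased.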