Adaptive Ensemble Active Learning for Drifting Data Stream Mining
Authors: Bartosz Krawczyk, Alberto Cano
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This experimental study was designed to answer four research questions that will provide an insight into the proposed EALMAB approach: |
| Researcher Affiliation | Academia | 1Department of Computer Science, Virginia Commonwealth University, Richmond VA, USA {bkrawczyk,acano}@vcu.edu |
| Pseudocode | No | No section or figure explicitly labeled "Pseudocode" or "Algorithm". |
| Open Source Code | No | No explicit statement or link indicating the release of source code for the described methodology. |
| Open Datasets | Yes | For the purpose of evaluating our proposed algorithm, we generated 10 diverse and large-scale data stream benchmarks using MOA environment [Bifet et al., 2010a], as well as two popular real-world data streams. By using data stream generators, we were able to fully control the nature and occurrence of concept drifts, which in turn leads to a more explainable experimental study. Details of used data streams are given in Table 1. |
| Dataset Splits | Yes | Validation set. Used ensemble learning algorithm must be capable of evaluating the generalization metric (see Eq. 16) for each base classifier on instances unseen by this classifier. (...) For evaluating the generalization capabilities of base classifiers in the ensemble (see Eq. 16), we use a window ωV = 50 most recently labeled instances. |
| Hardware Specification | No | No specific hardware details (GPU/CPU models, memory, etc.) are provided for the experimental setup. |
| Software Dependencies | No | The paper mentions software like MOA environment and specific algorithms (Leveraging Bagging, Online Bagging, Accuracy Updated Ensemble, Hoeffding Trees, Naive Bayes) but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | We restrict the size of Leveraging Bagging to 10 base classifiers (as suggested by authors) and use Hoeffding Trees as base learners. (...) AL budgets. As we want for our experimental study to reflect a real-world scenario, we investigate small to medium budgets, from highly restrictive 1% to 30%, so B ∈ {0.01, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3}. (...) We use a window size ω = 1000 for calculating the budgets, prequential metrics, and training new classifiers for ensembles. For evaluating the generalization capabilities of base classifiers in the ensemble (see Eq. 16), we use a window ωV = 50 most recently labeled instances. |
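The reported setup fixes a handful of concrete parameters (ensemble size, label budgets, window sizes). As a minimal sketch of how those settings could be captured for a re-run, the constants below mirror the quoted values; the names and the `labels_allowed` helper are illustrative, not taken from the paper or from MOA:

```python
# Hypothetical configuration mirroring the setup reported in the paper.
# Constant names are illustrative only.
ENSEMBLE_SIZE = 10        # Leveraging Bagging restricted to 10 Hoeffding Trees
AL_BUDGETS = [0.01, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30]  # label budgets B
WINDOW_SIZE = 1000        # window ω: budgets, prequential metrics, training
VALIDATION_WINDOW = 50    # window ωV: generalization evaluation (Eq. 16)

def labels_allowed(budget: float, window: int = WINDOW_SIZE) -> int:
    """Number of label queries permitted within one window at a given budget."""
    return round(budget * window)
```

For example, under the most restrictive budget of 1%, only 10 of the 1000 instances in a window may be queried for labels.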