Online Boosting Adaptive Learning under Concept Drift for Multistream Classification
Authors: En Yu, Jie Lu, Bin Zhang, Guangquan Zhang
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments on several synthetic and real-world data streams, encompassing various drifting scenarios and types. The results clearly demonstrate that OBAL achieves remarkable advancements in addressing multistream classification problems by effectively leveraging positive knowledge derived from multiple sources. In the experiment, we first empirically demonstrated that OBAL consistently outperforms current methods, highlighting both robustness and superiority. Second, we validated the substantial impact of dynamic inter-stream relationships on prediction, emphasizing the effectiveness of AdaCOSA through an ablation study. Additionally, we confirmed OBAL's scalability across various data streams, validating its consistent predictive performance. Finally, we assessed parameter sensitivity, time complexity, and algorithmic cost. |
| Researcher Affiliation | Academia | En Yu, Jie Lu*, Bin Zhang, Guangquan Zhang Decision Systems and e-Service Intelligence Laboratory, Australian Artificial Intelligence Institute (AAII), Faculty of Engineering and Information Technology, University of Technology Sydney, Australia En.Yu@student.uts.edu.au; {Jie.Lu, Bin.Zhang, Guangquan.Zhang}@uts.edu.au |
| Pseudocode | Yes | Algorithm 1: Initialization (AdaCOSA) and Algorithm 2: The learning process of OBAL (an illustrative skeleton of this two-phase structure is sketched below the table). |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Benchmark Datasets. We conduct the experiment on four synthetic datasets (i.e., SEA (Street and Kim 2001), Tree (Liu, Lu, and Zhang 2020), RBF (Song et al. 2021a), and Hyperplane (Bifet and Gavalda 2007) ) and four popular real-world datasets (Weather (Ditzler and Polikar 2012), Kitti (Geiger, Lenz, and Urtasun 2012), CNNIBN (Vyas et al. 2014), and BBC (Vyas et al. 2014)), and more detailed descriptions of each dataset and multistream scenario simulation can be found in Supplementary S3 and Table S1. |
| Dataset Splits | No | The paper discusses training and initial sample sizes, but it does not provide specific details on training, validation, and test dataset splits (e.g., percentages, sample counts, or explicit mention of a validation set). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU specifications, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions various algorithms and methods used (e.g., DDM, GMM, CORAL, EM algorithm) but does not provide specific version numbers for any software libraries, frameworks, or programming languages used in the implementation. |
| Experiment Setup | Yes | In the proposed OBAL, there are three main parameters affecting the classification performance: the window size of the initialization stage Ln, the number of re-weighting steps Imax, and the maximum classifier pool size \|P\|. To analyze their impact on the overall performance, we carry out experiments under various values of all parameters on all datasets. Here, we set Ln ∈ {100, 200, 300, 400, 500}, Imax ∈ {1, 3, 5, 7, 10}, and \|P\| ∈ {1, 5, 10, 15, 20}. ... Detailed parameter settings are shown in Table S2 in the supplementary. (A hedged sketch of enumerating this parameter grid appears below the table.) |
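
Because no source code is released (see the Open Source Code row), the following is a minimal sketch of how the parameter-sensitivity sweep described in the Experiment Setup row could be enumerated. Only the grid values {100, 200, 300, 400, 500}, {1, 3, 5, 7, 10}, and {1, 5, 10, 15, 20} come from the paper; the names `run_obal` and `sensitivity_sweep` are hypothetical placeholders.

```python
from itertools import product

# Grid values quoted from the paper's parameter-sensitivity study.
LN_VALUES = [100, 200, 300, 400, 500]   # window size of the initialization stage (L_n)
IMAX_VALUES = [1, 3, 5, 7, 10]          # number of re-weighting steps (I_max)
POOL_SIZES = [1, 5, 10, 15, 20]         # maximum classifier pool size |P|


def run_obal(stream, init_window, reweight_steps, max_pool):
    """Hypothetical stand-in for a single OBAL run; would return an accuracy score."""
    raise NotImplementedError("The authors' implementation is not publicly released.")


def sensitivity_sweep(stream):
    """Enumerate every (L_n, I_max, |P|) combination from the reported grid."""
    results = {}
    for ln, imax, pool in product(LN_VALUES, IMAX_VALUES, POOL_SIZES):
        results[(ln, imax, pool)] = run_obal(stream, ln, imax, pool)
    return results
```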
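
The Pseudocode and Software Dependencies rows name the paper's building blocks (an AdaCOSA initialization stage, an online learning stage, DDM for drift detection, GMM/CORAL/EM as supporting components) but the actual algorithms are only available as pseudocode in the paper. The class below is an illustrative reconstruction of the two-phase organization under those assumptions, not the authors' method: every method body is a hypothetical placeholder, and the real AdaCOSA and OBAL update rules are not reproduced here.

```python
from collections import deque
from typing import Any, Iterable, Tuple


class OBALOutline:
    """Illustrative two-phase outline inferred from the table above.

    NOT the authors' implementation (none is publicly released); all private
    methods are hypothetical placeholders.
    """

    def __init__(self, ln: int = 200, imax: int = 5, max_pool: int = 10):
        self.ln = ln                        # initialization window size L_n
        self.imax = imax                    # AdaCOSA-style re-weighting steps I_max
        self.pool = deque(maxlen=max_pool)  # bounded classifier pool |P|

    # Phase 1 ("Algorithm 1: Initialization (AdaCOSA)"): align each source
    # window to the target window and train one base learner per source.
    def initialize(self, source_windows: Iterable[Any], target_window: Any) -> None:
        for src in source_windows:
            aligned = self._adacosa_align(src, target_window)    # hypothetical alignment
            self.pool.append(self._train_base_learner(aligned))  # hypothetical training

    # Phase 2 ("Algorithm 2: The learning process of OBAL"): predict each
    # target sample, check for drift, and update the ensemble online.
    def learn_online(self, target_stream: Iterable[Tuple[Any, Any]]):
        for x, y in target_stream:
            y_hat = self._ensemble_predict(x)
            if self._drift_detected(y_hat, y):  # DDM-style check (hypothetical)
                self.pool.append(self._train_base_learner([(x, y)]))
            self._update_weights(x, y)
            yield y_hat

    # --- hypothetical placeholders; the actual rules are given in the paper ---
    def _adacosa_align(self, src, tgt):
        raise NotImplementedError

    def _train_base_learner(self, data):
        raise NotImplementedError

    def _ensemble_predict(self, x):
        raise NotImplementedError

    def _drift_detected(self, y_hat, y):
        raise NotImplementedError

    def _update_weights(self, x, y):
        raise NotImplementedError
```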