Optimal and Adaptive Algorithms for Online Boosting

Authors: Alina Beygelzimer, Satyen Kale, Haipeng Luo

IJCAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | While the focus of this paper is a theoretical investigation of online boosting, we also performed an experimental evaluation. We extended the Vowpal Wabbit open source machine learning system [4] to include the algorithms studied in this paper. [...] All experiments were done on a diverse collection of 13 publicly available datasets. For each dataset, we performed a random split with 80% of the data used for single-pass training and the remaining 20% for testing. We tuned the learning rate, the number of weak learners, and the edge parameter γ (for all but the two versions of AdaBoost.OL) using progressive validation 0-1 loss on the training set. [...] Reported in Table 1 is the 0-1 loss on the test set.
Researcher Affiliation | Collaboration | Alina Beygelzimer (beygel@yahoo-inc.com), Yahoo Research, New York, NY 10036; Satyen Kale (satyen@yahoo-inc.com), Yahoo Research, New York, NY 10036; Haipeng Luo (haipengl@cs.princeton.edu), Princeton University, Princeton, NJ 08540
Pseudocode | Yes | Algorithm 1 (Online BBM) [...] Algorithm 2 (AdaBoost.OL). (A generic online-boosting sketch appears after this table.)
Open Source Code | No | The paper states: 'We extended the Vowpal Wabbit open source machine learning system [4] to include the algorithms studied in this paper.' and footnote 4 links to 'https://github.com/JohnLangford/vowpal_wabbit/wiki'. This indicates the authors used and extended an existing open-source system (VW), but it does not explicitly state that their specific implementations or extensions of the algorithms presented in this paper are released as open source, and no direct link to their code is given.
Open Datasets | No | The paper states 'All experiments were done on a diverse collection of 13 publicly available datasets.' and lists dataset names (e.g., '20news', 'a9a', 'adult'), but it does not provide concrete access information such as URLs, DOIs, specific repository names, or formal citations with authors and years for these datasets.
Dataset Splits | Yes | For each dataset, we performed a random split with 80% of the data used for single-pass training and the remaining 20% for testing. We tuned the learning rate, the number of weak learners, and the edge parameter γ (for all but the two versions of AdaBoost.OL) using progressive validation 0-1 loss on the training set. Progressive validation is a standard online validation technique, where each training example is used for testing before it is used for updating the model [Blum et al., 1999]. (A minimal progressive-validation sketch appears after this table.)
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU or GPU models, memory) used for running the experiments; it only mentions the VW software system.
Software Dependencies | No | The paper mentions extending the 'Vowpal Wabbit open source machine learning system' but does not specify a version number for VW or any other software dependencies.
Experiment Setup | Yes | We tuned the learning rate, the number of weak learners, and the edge parameter γ (for all but the two versions of AdaBoost.OL) using progressive validation 0-1 loss on the training set. (A sketch of this split-and-tune protocol appears after this table.)
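
The following is a minimal, generic online-boosting skeleton intended only to illustrate the structure shared by Algorithm 1 (Online BBM) and Algorithm 2 (AdaBoost.OL): several online weak learners are combined by a weighted vote, and each weak learner is updated with an importance weight computed from the partial margin of the preceding learners on the current example. The class name, the compute_weight placeholder, and the weak-learner interface are assumptions for illustration; the paper's actual weight rules (potential-based for Online BBM, logistic-loss-based with online-learned voting weights for AdaBoost.OL) are not reproduced here.

    # Illustrative online-boosting skeleton (not the paper's exact algorithms).
    # Labels are in {-1, +1}; compute_weight is a placeholder for the
    # importance-weight rule (potential-based in Online BBM, derived from the
    # logistic loss in AdaBoost.OL).
    class OnlineBooster:
        def __init__(self, weak_learners, compute_weight):
            self.learners = weak_learners                # online weak learners
            self.alpha = [1.0] * len(weak_learners)      # voting weights (fixed here;
                                                         # AdaBoost.OL learns them online)
            self.compute_weight = compute_weight         # partial margin -> example weight

        def predict(self, x):
            score = sum(a * wl.predict(x) for a, wl in zip(self.alpha, self.learners))
            return 1 if score >= 0 else -1

        def update(self, x, y):
            margin = 0.0
            for a, wl in zip(self.alpha, self.learners):
                pred = wl.predict(x)                     # prediction before this update
                w = self.compute_weight(margin)          # importance weight for this learner
                wl.update(x, y, weight=w)                # weak learner takes weighted examples
                margin += a * y * pred                   # extend the partial margin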
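
Progressive validation, as quoted in the 'Dataset Splits' row, is simple to state in code. The sketch below assumes a hypothetical online learner exposing predict/update methods and computes the progressive validation 0-1 loss over a stream of labeled examples; it is a minimal illustration, not the paper's VW implementation.

    # Minimal sketch of progressive validation (Blum et al., 1999): each
    # example is scored by the current model before the model is updated on
    # it, so the running 0-1 loss honestly reflects online performance.
    def progressive_validation_loss(learner, stream):
        mistakes, total = 0, 0
        for x, y in stream:
            y_hat = learner.predict(x)   # test on the example first ...
            mistakes += int(y_hat != y)
            learner.update(x, y)         # ... then train on it
            total += 1
        return mistakes / max(total, 1)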
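
Finally, the split-and-tune protocol described in the 'Research Type' and 'Experiment Setup' rows can be sketched as follows. The grid values, the make_booster factory, and the random seed are hypothetical placeholders (the paper does not report its grids); only the overall structure follows the quoted description: an 80/20 random split, model selection by progressive validation 0-1 loss on the training portion, and a final 0-1 loss on the held-out test portion. It reuses progressive_validation_loss from the previous sketch.

    # Hypothetical sketch of the tuning protocol: 80/20 random split, grid
    # search over (learning rate, number of weak learners, edge parameter
    # gamma) by progressive validation 0-1 loss on the training split, then
    # test 0-1 loss on the held-out 20%. Grid values and make_booster are
    # placeholders, not the paper's exact VW configuration.
    import random
    from itertools import product

    def split_80_20(data, seed=0):
        rng = random.Random(seed)
        data = list(data)
        rng.shuffle(data)
        cut = int(0.8 * len(data))
        return data[:cut], data[cut:]      # train (80%), test (20%)

    def tune_and_evaluate(data, make_booster,
                          learning_rates=(0.1, 0.5, 1.0),
                          num_learners=(5, 10, 20),
                          gammas=(0.01, 0.1, 0.25)):
        train, test = split_80_20(data)
        best = None
        for lr, n, gamma in product(learning_rates, num_learners, gammas):
            pv_loss = progressive_validation_loss(make_booster(lr, n, gamma), train)
            if best is None or pv_loss < best[0]:
                best = (pv_loss, (lr, n, gamma))
        booster = make_booster(*best[1])   # retrain the selected configuration
        for x, y in train:
            booster.update(x, y)
        test_loss = sum(int(booster.predict(x) != y) for x, y in test) / max(len(test), 1)
        return best[1], test_loss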