Optimal and Adaptive Algorithms for Online Boosting
Authors: Alina Beygelzimer, Satyen Kale, Haipeng Luo
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | While the focus of this paper is a theoretical investigation of online boosting, we also performed an experimental evaluation. We extended the Vowpal Wabbit open source machine learning system [4] to include the algorithms studied in this paper. [...] All experiments were done on a diverse collection of 13 publicly available datasets. For each dataset, we performed a random split with 80% of the data used for single-pass training and the remaining 20% for testing. We tuned the learning rate, the number of weak learners, and the edge parameter γ (for all but the two versions of AdaBoost.OL) using progressive validation 0-1 loss on the training set. [...] Reported in Table 1 is the 0-1 loss on the test set. |
| Researcher Affiliation | Collaboration | Alina Beygelzimer (beygel@yahoo-inc.com), Yahoo Research, New York, NY 10036; Satyen Kale (satyen@yahoo-inc.com), Yahoo Research, New York, NY 10036; Haipeng Luo (haipengl@cs.princeton.edu), Princeton University, Princeton, NJ 08540 |
| Pseudocode | Yes | Algorithm 1: Online BBM [...] Algorithm 2: AdaBoost.OL |
| Open Source Code | No | The paper states: 'We extended the Vowpal Wabbit open source machine learning system [4] to include the algorithms studied in this paper.' Footnote 4 links to 'https://github.com/JohnLangford/vowpal_wabbit/wiki'. This indicates the authors used and extended an existing open-source system (VW), but it does not explicitly state that *their specific implementations or extensions* for the algorithms presented in this paper are released as open source with a direct link to *their* code. |
| Open Datasets | No | The paper states 'All experiments were done on a diverse collection of 13 publicly available datasets.' and lists dataset names (e.g., '20news', 'a9a', 'adult'), but it does not provide concrete access information such as URLs, DOIs, specific repository names, or formal citations with authors and years for these datasets. |
| Dataset Splits | Yes | For each dataset, we performed a random split with 80% of the data used for single-pass training and the remaining 20% for testing. We tuned the learning rate, the number of weak learners, and the edge parameter γ (for all but the two versions of Ada Boost.OL) using progressive validation 0-1 loss on the training set. Progressive validation is a standard online validation technique, where each training example is used for testing before it is used for updating the model [Blum et al., 1999]. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. It only mentions the VW software, not the machines it was run on. |
| Software Dependencies | No | The paper mentions extending 'Vowpal Wabbit open source machine learning system' but does not specify any version number for VW or any other software dependencies. |
| Experiment Setup | Yes | We tuned the learning rate, the number of weak learners, and the edge parameter γ (for all but the two versions of AdaBoost.OL) using progressive validation 0-1 loss on the training set. (A minimal sketch of this tuning protocol appears below the table.) |
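
The Research Type, Dataset Splits, and Experiment Setup rows all quote the same evaluation protocol: a random 80/20 split, single-pass online training, hyperparameters (learning rate, number of weak learners, edge parameter γ) selected by progressive validation 0-1 loss on the training split, and 0-1 loss reported on the held-out 20%. The sketch below is a minimal illustration of that protocol, not the authors' Vowpal Wabbit extension; the `make_learner(lr, n_weak, gamma)` factory, its `predict`/`update` interface, and the caller-supplied parameter grids are assumptions made for the example.

```python
import itertools
import random


def progressive_validation_01_loss(learner, examples):
    """Progressive validation [Blum et al., 1999]: each example scores the
    current model *before* the model is updated on it (single pass)."""
    mistakes = 0
    for x, y in examples:
        if learner.predict(x) != y:   # test on the example first ...
            mistakes += 1
        learner.update(x, y)          # ... then train on it
    return mistakes / len(examples)


def run_protocol(make_learner, dataset, lr_grid, n_weak_grid, gamma_grid, seed=0):
    """Hypothetical harness mirroring the quoted setup: random 80/20 split,
    hyperparameters chosen by progressive validation 0-1 loss on the training
    split, 0-1 loss reported on the held-out test split."""
    rng = random.Random(seed)
    data = list(dataset)
    rng.shuffle(data)
    cut = int(0.8 * len(data))
    train, test = data[:cut], data[cut:]

    # Grid search over (learning rate, number of weak learners, gamma);
    # the actual grids are not given in the excerpt, so callers supply them.
    best_params, best_loss = None, float("inf")
    for lr, n_weak, gamma in itertools.product(lr_grid, n_weak_grid, gamma_grid):
        loss = progressive_validation_01_loss(make_learner(lr, n_weak, gamma), train)
        if loss < best_loss:
            best_params, best_loss = (lr, n_weak, gamma), loss

    # Retrain with the selected parameters on the training split (single pass),
    # then report plain 0-1 loss on the remaining 20%.
    final = make_learner(*best_params)
    for x, y in train:
        final.update(x, y)
    test_loss = sum(final.predict(x) != y for x, y in test) / len(test)
    return best_params, test_loss
```

Note that the sketch tunes γ for every configuration; in the paper, γ is tuned only for the algorithms that take it as a parameter, consistent with the quote that it was tuned "for all but the two versions of AdaBoost.OL", which are adaptive.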