Training-Time Optimization of a Budgeted Booster
Authors: Yi Huang, Brian Powers, Lev Reyzin
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally show that our method improves upon the boosting approach AdaBoostRS [Reyzin, 2011] and in many cases also outperforms the recent algorithm SpeedBoost [Grubb and Bagnell, 2012]. We provide a theoretical justification for our optimization method via the margin bound. We also experimentally show that our method outperforms pruned decision trees, a natural budgeted classifier. ... Although there are a number of feature-efficient classification methods [Gao and Koller, 2011; Schwing et al., 2011; Xu et al., 2012], we directly compare the performance of AdaBoostBT, AdaBoostBT Greedy and AdaBoostBT Smoothed to AdaBoostRS and SpeedBoost as both are feature-efficient boosting methods which allow for any class of weak learners. For our experiments we first used datasets from the UCI repository, as shown in Table 1. |
| Researcher Affiliation | Academia | Yi Huang, Brian Powers, Lev Reyzin; Department of Mathematics, Statistics, and Computer Science; University of Illinois at Chicago; Chicago, IL 60607; {yhuang,bpower6,lreyzin}@math.uic.edu |
| Pseudocode | Yes | Algorithm 1 AdaBoostBT(S, B, C), where: S ⊆ X × {−1, +1}, B > 0, C : [1 . . . n] → R⁺ (an illustrative sketch of such a budgeted boosting loop appears below the table) |
| Open Source Code | No | The paper does not provide any statement or link regarding the public availability of its source code. |
| Open Datasets | Yes | For our experiments we first used datasets from the UCI repository, as shown in Table 1. ... Then, to study our algorithms on real-world data, we used the Yahoo! Webscope dataset, which includes feature costs [Xu et al., 2012]. |
| Dataset Splits | No | Table 1 provides 'training size' and 'test size' for the datasets, and figures mention 'test error', but there is no explicit mention of a 'validation' split or its size/methodology. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., CPU, GPU models, memory specifications). |
| Software Dependencies | No | The paper mentions algorithms and components such as 'AdaBoost', 'decision stumps', and 'exponential loss', but does not provide version numbers for any software dependencies or libraries used. |
| Experiment Setup | Yes | For our experiments we first used datasets from the UCI repository, as shown in Table 1. ... AdaBoost was run for a number of rounds that gave lowest test error, irrespective of budget. This setup was chosen to compare directly against the results of Reyzin [2011], who also used random costs. ... Features are given costs uniformly at random on the interval [0, 2]. ... The data set contains 519 features, whose costs we rescaled to the set {0.1, 0.5, 1, 2, 5, 10, 15, 20}. |
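
The pseudocode row above only records the signature AdaBoostBT(S, B, C), and the paper does not release code. The following is a minimal sketch, assuming decision-stump weak learners, the standard AdaBoost exponential-loss reweighting, and a greedy rule under which a stump is only eligible if paying for its (not yet purchased) feature keeps the cumulative feature cost within the budget B. The function names, the stump search, and the stopping rule are illustrative assumptions, not the authors' exact procedure; the random costs on [0, 2] in the usage snippet mirror the UCI setup quoted in the Experiment Setup row.

```python
import numpy as np


def train_budgeted_adaboost(X, y, costs, budget, max_rounds=100, n_thresholds=10):
    """Illustrative budgeted boosting loop (NOT the paper's exact AdaBoostBT).

    A decision stump is eligible in a round only if paying for its feature
    (features already purchased are free to reuse) keeps the cumulative
    feature cost within `budget`. Examples are reweighted with the usual
    AdaBoost exponential-loss update.
    """
    n, d = X.shape
    w = np.full(n, 1.0 / n)        # example weights
    used = set()                   # indices of features already paid for
    spent = 0.0
    ensemble = []                  # list of (alpha, feature, threshold, polarity)

    for _ in range(max_rounds):
        best = None
        for j in range(d):
            extra = 0.0 if j in used else costs[j]
            if spent + extra > budget:
                continue           # this feature is unaffordable
            lo, hi = X[:, j].min(), X[:, j].max()
            for t in np.linspace(lo, hi, n_thresholds):
                for pol in (1, -1):
                    pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, t, pol, extra)
        if best is None or best[0] >= 0.5:
            break                  # nothing affordable, or no better than chance
        err, j, t, pol, extra = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        ensemble.append((alpha, j, t, pol))
        used.add(j)
        spent += extra
    return ensemble


def predict(ensemble, X):
    """Weighted vote of the purchased stumps."""
    score = np.zeros(len(X))
    for alpha, j, t, pol in ensemble:
        score += alpha * np.where(pol * (X[:, j] - t) >= 0, 1, -1)
    return np.where(score >= 0, 1, -1)


# Usage with feature costs drawn uniformly from [0, 2], mirroring the paper's
# UCI setup (the data and budget below are synthetic placeholders).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = np.where(X[:, 0] + X[:, 3] > 0, 1, -1)
costs = rng.uniform(0, 2, size=10)
model = train_budgeted_adaboost(X, y, costs, budget=5.0)
print("training error:", np.mean(predict(model, X) != y))
```

Charging only for the first use of each feature reflects the budgeted-prediction setting the paper studies, where a feature's cost is paid once per test example regardless of how many weak learners reuse it.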