The return of AdaBoost.MH: multi-class Hamming trees
Authors: Balázs Kégl
ICLR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments it is on par with support vector machines and with the best existing multi-class boosting algorithm AOSO-LogitBoost, and it is significantly better than other known implementations of AdaBoost.MH. |
| Researcher Affiliation | Academia | Balázs Kégl (balazs.kegl@gmail.com), LAL/LRI, University of Paris-Sud, CNRS, 91898 Orsay, France |
| Pseudocode | Yes | Figure 1. The pseudocode of the AdaBoost.MH algorithm with factorized base classifiers (6). Figure 2. The pseudocode of the Hamming tree base learner. Figure 3. Exhaustive search for the best decision stump. The pseudocode of the algorithm is given in Figure 3. Figure 5 contains the pseudocode of this simple operation. |
| Open Source Code | No | All experiments were done using the open source multiboost software of Benbouzid et al. (2012), version 1.2. |
| Open Datasets | Yes | We carried out experiments on five mid-sized (isolet, letter, optdigits, pendigits, and USPS) and nine small (balance, blood, wdbc, breast, ecoli, iris, pima, sonar, and wine) data sets from the UCI repository. |
| Dataset Splits | Yes | For the small data sets we ran 10 × 10 cross-validation (CV) to optimize the hyperparameters and to estimate the generalization error. For robustly estimating the optimal stopping time we use a smoothed test error. On the medium-size data sets we ran 1 × 5 CV (using the designated test sets where available) following the same procedure. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | Yes | All experiments were done using the open source multiboost software of Benbouzid et al. (2012), version 1.2. |
| Experiment Setup | Yes | For the number of inner nodes we do a grid search... For robustly estimating the optimal stopping time we use a smoothed test error with a linearly growing sliding window, that is, T* = arg min_{T : T_min < T ≤ T_max} (1/(0.2T)) Σ_{t=0.8T}^{T} R̂(t) (15), where T_min was set to a constant 50 to avoid stopping too early due to fluctuations. For selecting the best number of inner nodes N, we simply minimized the smoothed test error over a predefined grid: N* = arg min_{N ∈ 𝒩} R̂(T*(N)). We chose the best overall hyperparameters J = 20 and ν = 0.1, as suggested by Li (2009a;b). |
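The repeated cross-validation protocol quoted in the Dataset Splits row (10 × 10 CV on the small data sets) can be sketched as follows. This is a minimal illustration, assuming plain random fold assignment (the paper does not state whether folds were stratified), and the function name is ours, not the paper's.

```python
import numpy as np

def repeated_kfold_indices(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for n_repeats runs of n_folds-fold CV.

    A 10 x 10 CV as quoted above produces 100 train/test splits in total.
    Fold construction is a plain random partition; stratification is an
    assumption left out here.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        # Random permutation, then split into n_folds near-equal folds.
        perm = rng.permutation(n_samples)
        folds = np.array_split(perm, n_folds)
        for i in range(n_folds):
            test_idx = folds[i]
            train_idx = np.concatenate(
                [folds[j] for j in range(n_folds) if j != i]
            )
            yield train_idx, test_idx
```

Each repetition re-shuffles the data, so hyperparameters and the generalization error are averaged over 100 splits rather than a single 10-fold pass.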
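The smoothed stopping-time rule quoted in the Experiment Setup row (eq. 15) can be sketched as below. This follows one plausible reading of the garbled equation: the raw test error R̂(t) is averaged over a linearly growing window [0.8T, T], and the stopping time minimizes that average over T_min < T ≤ T_max. The function names are illustrative, not from the paper.

```python
import numpy as np

def smoothed_error(test_error, T):
    """Average test error over the sliding window [0.8*T, T] (inclusive)."""
    lo = int(0.8 * T)
    return float(np.mean(test_error[lo:T + 1]))

def optimal_stopping_time(test_error, t_min=50):
    """Pick T minimizing the window-smoothed test error, with T > t_min.

    t_min = 50 matches the constant quoted above, used to avoid stopping
    too early due to fluctuations in the raw error curve.
    """
    t_max = len(test_error) - 1
    candidates = range(t_min + 1, t_max + 1)
    return min(candidates, key=lambda T: smoothed_error(test_error, T))
```

On a U-shaped test-error curve this picks a T near (slightly after) the minimum of the curve, since the window average is centered around 0.9T; on a monotonically decreasing curve it runs to t_max, i.e. it never stops early.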