The return of AdaBoost.MH: multi-class Hamming trees

Authors: Balázs Kégl

ICLR 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments it is on par with support vector machines and with the best existing multi-class boosting algorithm AOSOLogitBoost, and it is significantly better than other known implementations of AdaBoost.MH.
Researcher Affiliation | Academia | Balázs Kégl (BALAZS.KEGL@GMAIL.COM), LAL/LRI, University of Paris-Sud, CNRS, 91898 Orsay, France
Pseudocode | Yes | Figure 1. The pseudocode of the AdaBoost.MH algorithm with factorized base classifiers (6). Figure 2. The pseudocode of the Hamming tree base learner. Figure 3. Exhaustive search for the best decision stump. The pseudocode of the algorithm is given in Figure 3. Figure 5 contains the pseudocode of this simple operation.
Open Source Code | No | All experiments were done using the open source multiboost software of Benbouzid et al. (2012), version 1.2.
Open Datasets | Yes | We carried out experiments on five mid-sized (isolet, letter, optdigits, pendigits, and USPS) and nine small (balance, blood, wdbc, breast, ecoli, iris, pima, sonar, and wine) data sets from the UCI repository.
Dataset Splits | Yes | For the small data sets we ran 10 × 10 cross-validation (CV) to optimize the hyperparameters and to estimate the generalization error. For robustly estimating the optimal stopping time we use a smoothed test error. On the medium-size data sets we ran 1 × 5 CV (using the designated test sets where available) following the same procedure.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies | Yes | All experiments were done using the open source multiboost software of Benbouzid et al. (2012), version 1.2.
Experiment Setup | Yes | For the number of inner nodes we do a grid search... For robustly estimating the optimal stopping time we use a smoothed test error using a linearly growing sliding window, that is, $T^* = \operatorname*{arg\,min}_{T:\, T_{\min} < T \le T_{\max}} \frac{1}{0.2T} \sum_{t=0.8T}^{T} \hat{R}(t)$ (15), where $T_{\min}$ was set to a constant 50 to avoid stopping too early due to fluctuations. For selecting the best number of inner nodes $N$, we simply minimized the smoothed test error over a predefined grid: $N^* = \operatorname*{arg\,min}_{N \in \mathcal{N}} \hat{R}^{(N)}(T^*_N)$. We chose the best overall hyperparameters $J = 20$ and $\nu = 0.1$, as suggested by Li (2009a;b).
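
To make the stopping-time and tree-size selection quoted above concrete, here is a minimal Python sketch, assuming the smoothed test error $\tilde{R}(T)$ of Eq. (15) is the mean per-iteration test error over the linearly growing window $[0.8T,\, T]$. The function names and the synthetic error curves (errors_by_N) are hypothetical stand-ins for illustration only; the paper's actual experiments were run with the multiboost C++ package.

```python
import numpy as np

def smoothed_error(test_error, T):
    """Smoothed test error R~(T): mean of the per-iteration test error
    over the linearly growing window [0.8*T, T] (cf. Eq. (15))."""
    lo = int(0.8 * T)
    return test_error[lo:T + 1].mean()

def select_stopping_time(test_error, T_min=50):
    """T* = argmin_{T_min < T <= T_max} R~(T); T_min = 50 avoids
    stopping too early due to fluctuations, as in the paper."""
    T_max = len(test_error) - 1
    return min(range(T_min + 1, T_max + 1),
               key=lambda T: smoothed_error(test_error, T))

def select_num_inner_nodes(errors_by_N):
    """Grid search over the number of inner nodes N: pick the N whose
    optimally stopped run has the lowest smoothed test error."""
    scores = {N: smoothed_error(err, select_stopping_time(err))
              for N, err in errors_by_N.items()}
    return min(scores, key=scores.get)

# Hypothetical per-iteration test-error curves for a grid of tree sizes.
rng = np.random.default_rng(0)
errors_by_N = {N: 0.5 * np.exp(-np.arange(500) / (40.0 * N))
                  + 0.01 * rng.standard_normal(500) ** 2
               for N in (2, 4, 8, 16)}
print(select_num_inner_nodes(errors_by_N))  # best N on this synthetic data
```

Because the window $[0.8T,\, T]$ grows linearly with $T$, late-training fluctuations are averaged over more iterations, which is what makes the argmin robust to noise in the raw test-error curve.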