AdaBoost is not an Optimal Weak to Strong Learner

Authors: Mikael Møller Høgsgaard, Kasper Green Larsen, Martin Ritzert

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | The main contribution of this work is to show that AdaBoost is not always optimal. Concretely, we show that there exists a weak learner W such that, if AdaBoost is run with W as its weak learner, its sample complexity is sub-optimal by at least one logarithmic factor. This is stated in the following theorem: Theorem 1.1. For any 0 < gamma < C, where C > 0 is a sufficiently small constant, any d = Omega(ln(1/gamma)), and any exp(-exp(Omega(d))) <= epsilon <= C, there exists a gamma-weak learner W using a hypothesis set H of VC-dimension d and a distribution D, such that AdaBoost run with W is sub-optimal and needs m_AdaBoost(epsilon) = Omega(d ln(1/epsilon) / (gamma^2 epsilon)) samples from D to output, with constant probability, a hypothesis with error at most epsilon under D. (A hedged side-by-side of this lower bound and the optimal rate is sketched after the table.)
Researcher Affiliation | Academia | Mikael Møller Høgsgaard*1, Kasper Green Larsen*1, Martin Ritzert*2. 1: Group of Algorithms, Data Structures and Foundations of Machine Learning, Aarhus University, Aarhus, Denmark. 2: Neural Data Science, Georg-August-Universität Göttingen, Göttingen, Germany.
Pseudocode | Yes | Algorithm 1: AdaBoost. (An illustrative Python sketch of the standard AdaBoost loop is given after this table.)
Open Source Code | No | The paper is theoretical and focuses on proofs and analysis; it does not mention or provide any open-source code for the work described.
Open Datasets | No | The paper is theoretical and discusses concepts like 'training data' and 'data distribution D' in a theoretical context, but it does not mention or provide access information for any specific publicly available datasets used in empirical experiments.
Dataset Splits | No | The paper is theoretical and does not describe empirical experiments that would involve training, validation, or test dataset splits.
Hardware Specification | No | The paper is theoretical and does not describe any experimental setup, thus no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies or their version numbers.
Experiment Setup | No | The paper is theoretical and focuses on proofs and analysis, so it does not include details about an experimental setup, hyperparameters, or training configurations.
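
To make the "sub-optimal by one logarithmic factor" claim in the Research Type row concrete, the LaTeX sketch below places the paper's Theorem 1.1 lower bound next to the rate achieved by a sample-optimal weak-to-strong learner. The form of the optimal rate is recalled from the prior work the paper compares against, not quoted from this paper, so treat the comparison as a sketch and check the paper for the exact statement and constants.

```latex
% Hedged comparison (optimal rate recalled from the weak-to-strong learning
% literature the paper builds on; exact constants should be checked there):
\[
  m_{\mathrm{opt}}(\varepsilon)
    = \Theta\!\left(\frac{d}{\gamma^{2}\varepsilon}\right),
  \qquad
  m_{\mathrm{AdaBoost}}(\varepsilon)
    = \Omega\!\left(\frac{d\,\ln(1/\varepsilon)}{\gamma^{2}\varepsilon}\right)
    = \Omega\!\big(\ln(1/\varepsilon)\big)\cdot m_{\mathrm{opt}}(\varepsilon).
\]
```

The extra ln(1/epsilon) factor in the second expression is exactly the logarithmic gap the paper refers to.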
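
The Pseudocode row refers to the paper's Algorithm 1 (AdaBoost). As a quick reference, here is a minimal Python sketch of the standard discrete AdaBoost loop, with decision stumps standing in for the abstract gamma-weak learner W analysed in the paper. It is a generic textbook reimplementation, not the paper's Algorithm 1 copied verbatim, and the names stump_weak_learner and adaboost are ours.

```python
# Minimal sketch of classic (discrete) AdaBoost with decision stumps as the
# weak learner. Illustrative only; not the paper's Algorithm 1 verbatim.
import numpy as np


def stump_weak_learner(X, y, w):
    """Pick the threshold stump (feature, threshold, sign) with lowest weighted error."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] <= thr, sign, -sign)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best = err, (j, thr, sign)
    j, thr, sign = best
    return lambda Xq: np.where(Xq[:, j] <= thr, sign, -sign)


def adaboost(X, y, T=50):
    """Run T rounds of AdaBoost; labels y must be in {-1, +1}."""
    m = len(y)
    w = np.full(m, 1.0 / m)                 # initial distribution D_1(i) = 1/m
    alphas, hypotheses = [], []
    for _ in range(T):
        h = stump_weak_learner(X, y, w)     # weak hypothesis trained on weighted sample
        pred = h(X)
        eps = np.clip(w[pred != y].sum(), 1e-12, 1 - 1e-12)  # weighted error eps_t
        alpha = 0.5 * np.log((1.0 - eps) / eps)               # vote weight alpha_t
        w *= np.exp(-alpha * y * pred)      # up-weight mistakes, down-weight correct points
        w /= w.sum()                        # renormalise to a distribution D_{t+1}
        alphas.append(alpha)
        hypotheses.append(h)
    # Final strong hypothesis: sign of the alpha-weighted majority vote.
    return lambda Xq: np.sign(sum(a * h(Xq) for a, h in zip(alphas, hypotheses)))
```

For instance, on a toy sample X of shape (m, d) with labels y in {-1, +1}, H = adaboost(X, y, T=20) returns a callable strong hypothesis, and np.mean(H(X) != y) gives its training error.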