Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Multiclass Boosting and the Cost of Weak Learning

Authors: Nataly Brukhim, Elad Hazan, Shay Moran, Indraneel Mukherjee, Robert E. Schapire

NeurIPS 2021 | Venue PDF | LLM Run Details

Each entry below lists the reproducibility variable and its classified result, followed by the LLM response.

Research Type: Theoretical
In this work we study multiclass boosting with a possibly large number of classes or categories. We find that the boosting algorithm itself only requires O(log k) samples, as we show by analyzing a variant of AdaBoost for our setting. In stark contrast, assuming typical limits on the number of weak-learner calls, we prove that the number of samples required by a weak learner is at least polynomial in k, exponentially more than the number of samples needed by the booster. Alternatively, we prove that the weak learner's accuracy parameter must be smaller than an inverse polynomial in k, showing that the returned weak hypotheses must be nearly the best in their class when k is large. We also prove a trade-off between the number of oracle calls and the resources required of the weak learner: the fewer the calls to the weak learner, the more that is demanded of each call.

Researcher Affiliation: Collaboration
Nataly Brukhim (Princeton University; Google AI Princeton), Elad Hazan (Princeton University; Google AI Princeton), Shay Moran (Technion; Google Research), Indraneel Mukherjee, Robert E. Schapire (Microsoft Research)

Pseudocode: Yes
Algorithm 1: Multiclass AdaBoost

Open Source Code: No
The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.

Open Datasets: No
The paper describes theoretical concepts and calculations of sample complexity (e.g., m_w = O(k^2)) but does not mention the use of a specific, publicly available dataset or provide access information for one.

Dataset Splits: No
The paper does not specify any dataset splits (e.g., percentages or counts for training, validation, or test sets), as it is a theoretical work.

Hardware Specification: No
The paper does not provide specific details on hardware used for experiments; as a theoretical work, it does not describe empirical evaluations requiring such specifications.

Software Dependencies: No
The paper does not specify software dependencies with version numbers, as it is a theoretical work and does not detail a software implementation requiring them.

Experiment Setup: No
The paper does not provide specific details about an experimental setup, such as hyperparameters or training configurations, as it is a theoretical work.
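The paper's Algorithm 1 (Multiclass AdaBoost) is not reproduced in this report. As a rough illustration of what a multiclass boosting loop of this general shape looks like, here is a sketch of the SAMME variant of multiclass AdaBoost with a weighted decision-stump weak learner; this is not the authors' algorithm, and all function names and the stump weak learner are assumptions made for the example.

```python
import numpy as np

def stump_train(X, y, w, k):
    """Weak learner: weighted one-feature threshold stump (illustrative choice)."""
    best_err, best = np.inf, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue
            # Weighted majority class on each side of the threshold.
            cl = np.argmax(np.bincount(y[left], weights=w[left], minlength=k))
            cr = np.argmax(np.bincount(y[~left], weights=w[~left], minlength=k))
            pred = np.where(left, cl, cr)
            err = w[pred != y].sum()
            if err < best_err:
                best_err, best = err, (j, t, cl, cr)
    return best_err, best

def stump_predict(stump, X):
    j, t, cl, cr = stump
    return np.where(X[:, j] <= t, cl, cr)

def samme_fit(X, y, k, T=20):
    """SAMME-style multiclass boosting: reweight samples, collect weighted stumps."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(T):
        err, stump = stump_train(X, y, w, k)
        err = max(err, 1e-12)
        if err >= 1 - 1.0 / k:  # weak learner fails the multiclass edge condition
            break
        # The extra log(k - 1) term is what adapts AdaBoost's weights to k classes.
        alpha = np.log((1 - err) / err) + np.log(k - 1)
        pred = stump_predict(stump, X)
        w *= np.exp(alpha * (pred != y))   # upweight misclassified samples
        w /= w.sum()
        ensemble.append((alpha, stump))
    return ensemble

def samme_predict(ensemble, X, k):
    """Final hypothesis: weighted plurality vote over the weak hypotheses."""
    votes = np.zeros((len(X), k))
    for alpha, stump in ensemble:
        votes[np.arange(len(X)), stump_predict(stump, X)] += alpha
    return votes.argmax(axis=1)
```

Note how each round calls the weak-learner oracle once and demands only a weighted error below 1 - 1/k, which connects to the paper's theme of trading off the number of oracle calls against the accuracy demanded of each call.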