Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Multiclass Boosting and the Cost of Weak Learning
Authors: Nataly Brukhim, Elad Hazan, Shay Moran, Indraneel Mukherjee, Robert E. Schapire
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work we study multiclass boosting with a possibly large number of classes or categories. We find that the boosting algorithm itself only requires O(log k) samples, as we show by analyzing a variant of AdaBoost for our setting. In stark contrast, assuming typical limits on the number of weak-learner calls, we prove that the number of samples required by a weak learner is at least polynomial in k, exponentially more than the number of samples needed by the booster. Alternatively, we prove that the weak learner's accuracy parameter must be smaller than an inverse polynomial in k, showing that the returned weak hypotheses must be nearly the best in their class when k is large. We also prove a trade-off between the number of oracle calls and the resources required of the weak learner, meaning that the fewer the calls to the weak learner, the more that is demanded of it on each call. |
| Researcher Affiliation | Collaboration | Nataly Brukhim (Princeton University; Google AI Princeton); Elad Hazan (Princeton University; Google AI Princeton); Shay Moran (Technion; Google Research); Indraneel Mukherjee; Robert E. Schapire (Microsoft Research) |
| Pseudocode | Yes | Algorithm 1 Multiclass Adaboost |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper describes theoretical concepts and calculations of sample complexity (e.g., m_w = O(k²)) but does not mention the use of a specific, publicly available dataset or provide access information for one. |
| Dataset Splits | No | The paper does not specify any dataset splits (e.g., percentages or counts for training, validation, or test sets) as it is a theoretical work. |
| Hardware Specification | No | The paper does not provide specific details on hardware used for experiments, as it is a theoretical work and does not describe empirical evaluations requiring such specifications. |
| Software Dependencies | No | The paper does not specify software dependencies with version numbers, as it is a theoretical work and does not detail a software implementation requiring such specifications. |
| Experiment Setup | No | The paper does not provide specific details about an experimental setup, such as hyperparameters or training configurations, as it is a theoretical work. |
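The paper's Algorithm 1 (Multiclass Adaboost) is given only as pseudocode and is not reproduced here. As a rough illustration of the setting, the sketch below implements a standard multiclass AdaBoost variant (SAMME) with a weighted decision stump as the weak learner; the stump, the function names, and all parameters are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def fit_stump(X, y, w, k):
    """Weighted decision stump: pick the (feature, threshold) split that
    minimizes weighted error, predicting the weighted-majority class per side."""
    n, d = X.shape
    best = None  # (err, feature, threshold, left_class, right_class)
    for j in range(d):
        for thr in np.unique(X[:, j]):
            left = X[:, j] <= thr
            lc = int(np.argmax([w[left & (y == c)].sum() for c in range(k)]))
            rc = int(np.argmax([w[~left & (y == c)].sum() for c in range(k)]))
            err = w[np.where(left, lc, rc) != y].sum()
            if best is None or err < best[0]:
                best = (err, j, thr, lc, rc)
    err, j, thr, lc, rc = best
    return (lambda Z: np.where(Z[:, j] <= thr, lc, rc)), err

def samme(X, y, k, T=20):
    """SAMME multiclass boosting: reweight examples each round and combine
    weak hypotheses by weighted vote over the k classes."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    hyps, alphas = [], []
    for _ in range(T):
        h, err = fit_stump(X, y, w, k)
        err = max(err, 1e-12)
        if err >= 1.0 - 1.0 / k:  # weak-learning condition violated; stop
            break
        alpha = np.log((1 - err) / err) + np.log(k - 1)
        w *= np.exp(alpha * (h(X) != y))
        w /= w.sum()
        hyps.append(h)
        alphas.append(alpha)
    def predict(Z):
        votes = np.zeros((len(Z), k))
        for h, a in zip(hyps, alphas):
            votes[np.arange(len(Z)), h(Z)] += a
        return votes.argmax(axis=1)
    return predict
```

Note the k-dependent margin in the stopping condition: a weak hypothesis only helps once its error beats 1 − 1/k (random guessing over k classes), which is the kind of accuracy requirement on the weak learner that the paper's lower bounds sharpen as k grows.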