Label Efficient Learning by Exploiting Multi-Class Output Codes

Authors: Maria Balcan, Travis Dick, Yishay Mansour

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We present a new perspective on the popular multi-class algorithmic techniques of one-vs-all and error correcting output codes. Rather than studying the behavior of these techniques for supervised learning, we establish a connection between the success of these methods and the existence of label-efficient learning procedures. We show that in both the realizable and agnostic cases, if output codes are successful at learning from labeled data, they implicitly assume structure on how the classes are related. By making that structure explicit, we design learning algorithms to recover the classes with low label complexity. We provide results for the commonly studied cases of one-vs-all learning and when the codewords of the classes are well separated. We additionally consider the more challenging case where the codewords are not well separated, but satisfy a boundary features condition that captures the natural intuition that every bit of the codewords should be significant. Full proofs of all our results can be found in the full version of the paper (Balcan, Dick, and Mansour 2016), while we present the modeling and technical flow of ideas here.
Researcher Affiliation | Collaboration | Maria Florina Balcan, School of Computer Science, Carnegie Mellon University (ninamf@cs.cmu.edu); Travis Dick, School of Computer Science, Carnegie Mellon University (tdick@cs.cmu.edu); Yishay Mansour, Microsoft Research and Tel Aviv University (mansour@tau.ac.il)
Pseudocode | Yes | Algorithm 1: Single-linkage learning; Algorithm 2: Hierarchical single-linkage learning; Algorithm 3: Robust single-linkage learning; Algorithm 4: Plane-detection algorithm. (Illustrative sketches of output-code decoding and of the single-linkage idea follow this table.)
Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the methodology described, nor does it include links to a code repository.
Open Datasets | No | The paper does not mention the use of specific, named datasets for training or provide any information about their public availability or source. It discusses abstract data distributions, such as a distribution P on an instance space X ⊆ R^d.
Dataset Splits | No | As a theoretical paper, it does not describe specific training, validation, or test dataset splits or their sizes. It discusses theoretical sample complexities instead.
Hardware Specification | No | The paper is theoretical and does not describe any specific hardware components (e.g., GPU/CPU models, memory specifications) used for running experiments.
Software Dependencies | No | The paper is theoretical and does not specify any software dependencies or their version numbers, such as programming languages, libraries, or frameworks.
Experiment Setup | No | The paper is theoretical and does not provide specific experimental setup details such as hyperparameter values, training schedules, or system-level configurations.
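
The abstract above centers on one-vs-all and error-correcting output codes (ECOC). For context, here is a minimal, hypothetical sketch of standard ECOC decoding, not the paper's code: each class is assigned a binary codeword, a point is passed through the binary predictors, and the predicted class is the one whose codeword is nearest in Hamming distance. The function name and the `predictors`/`codewords` arguments are illustrative assumptions.

```python
import numpy as np

def ecoc_predict(x, predictors, codewords):
    # predictors: list of m functions, each mapping a point x to {0, 1}
    # codewords: (k, m) binary array, one codeword row per class
    bits = np.array([h(x) for h in predictors])    # predicted codeword for x
    dists = np.sum(codewords != bits, axis=1)      # Hamming distance to each class
    return int(np.argmin(dists))                   # nearest codeword wins

# One-vs-all is the special case where the codeword matrix is the identity:
# codewords = np.eye(k, dtype=int)
```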
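Similarly, a hedged sketch of the idea behind Algorithm 1 (single-linkage learning), assuming it amounts to single-linkage clustering of unlabeled data followed by one label query per recovered cluster; the threshold r, the function names, and the use of SciPy are assumptions for illustration, not the paper's implementation.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def single_linkage_learn(X, query_label, r):
    # X: (n, d) array of unlabeled points
    # query_label(i): labeling oracle returning the true label of point i
    # r: merge-distance threshold (an assumed parameter)
    Z = linkage(pdist(X), method="single")             # single-linkage dendrogram
    clusters = fcluster(Z, t=r, criterion="distance")  # cut the tree at distance r
    labels = np.empty(len(X), dtype=object)
    for c in np.unique(clusters):
        idx = np.flatnonzero(clusters == c)
        labels[idx] = query_label(idx[0])              # one label query per cluster
    return labels
```

The point of the sketch is the label-complexity claim in the abstract: labels are requested per cluster rather than per point, so the number of queries scales with the number of recovered clusters.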