Estimating decision tree learnability with polylogarithmic sample complexity

Authors: Guy Blanc, Neha Gupta, Jane Lange, Li-Yang Tan

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We show that top-down decision tree learning heuristics (such as ID3, C4.5, and CART) are amenable to highly efficient learnability estimation: for monotone target functions, the error of the decision tree hypothesis constructed by these heuristics can be estimated with polylogarithmically many labeled examples, exponentially smaller than the number necessary to run these heuristics, and indeed, exponentially smaller than the information-theoretic minimum required to learn a good decision tree. This adds to a small but growing list of fundamental learning algorithms that have been shown to be amenable to learnability estimation. En route to this result, we design and analyze sample-efficient minibatch versions of top-down decision tree learning heuristics and show that they achieve the same provable guarantees as the full-batch versions. We further give "active local" versions of these heuristics: given a test point x⋆, we show how the label T(x⋆) of the decision tree hypothesis T can be computed with polylogarithmically many labeled examples, exponentially smaller than the number necessary to learn T.
Researcher Affiliation | Academia | Guy Blanc (Stanford University, gblanc@cs.stanford.edu); Neha Gupta (Stanford University, nehagupta@cs.stanford.edu); Jane Lange (Massachusetts Institute of Technology, jlange@mit.edu); Li-Yang Tan (Stanford University, liyang@cs.stanford.edu)
Pseudocode | Yes | Figure 1: MINIBATCHTOPDOWN_G takes as input a size parameter t, a minibatch size b, and a labeled dataset S. It outputs a size-t decision tree hypothesis for f. Figure 2: LOCALLEARNER_G takes as input a size parameter t, a minibatch size b, an unlabeled dataset S, and an input x⋆. It selectively queries f's values on a few points within S and outputs T(x⋆), where T is a tree of size approximately t that MINIBATCHTOPDOWN_G would return if we were to label all of S and train MINIBATCHTOPDOWN_G on it. (Hedged Python sketches of both procedures appear after this table.)
Open Source Code | No | The paper focuses on theoretical contributions and does not mention releasing any source code for the described methodologies.
Open Datasets | No | The paper describes the theoretical setting for training data (e.g., "n labeled training examples (x, f(x)) where x ∈ {±1}^d is uniform random") but does not specify or provide access to any actual public dataset used for training. (A synthetic example matching this setting appears after this table.)
Dataset Splits | No | This is a theoretical paper and does not specify dataset splits (training, validation, test) for empirical evaluation.
Hardware Specification | No | This is a theoretical paper and does not describe any experimental setup or hardware used for running experiments.
Software Dependencies | No | This is a theoretical paper focusing on mathematical analysis and algorithm design principles, and thus does not list specific software dependencies with version numbers.
Experiment Setup | No | This is a theoretical paper and does not include details about an experimental setup, such as hyperparameters or training configurations.
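
The paper releases no implementation, so the following is a minimal Python sketch of the minibatch heuristic in the spirit of MINIBATCHTOPDOWN_G (Figure 1): at every step, each candidate split is scored by its empirical purity gain on a fresh minibatch of at most b labeled examples reaching that leaf, and the highest-scoring (leaf, coordinate) pair is split. All names here (minibatch_top_down, gini, _reach, _best_split, predict) are our own hypothetical reconstructions, not the authors' code; the paper's guarantees hold for suitable smooth, concave impurity functions G, and G(p) = p(1 - p) is used only for concreteness.

import numpy as np

def gini(p):
    # Impurity function G. The paper's analysis covers smooth, concave G;
    # the Gini-style criterion G(p) = p(1 - p) is one such choice.
    return p * (1.0 - p)

def _reach(X, path):
    # Boolean mask of the examples consistent with a root-to-leaf path,
    # where the path is a list of (coordinate, required sign in {-1, +1}).
    mask = np.ones(len(X), dtype=bool)
    for i, s in path:
        mask &= (X[:, i] == s)
    return mask

def _best_split(Xb, yb, used, G):
    # Coordinate maximizing the empirical purity gain on the minibatch:
    # G(mean(y)) minus the weighted impurity of the two branches.
    best_i, best_drop = None, 0.0
    for i in range(Xb.shape[1]):
        if i in used:
            continue
        left, right = yb[Xb[:, i] == -1], yb[Xb[:, i] == +1]
        if len(left) == 0 or len(right) == 0:
            continue
        drop = G(yb.mean()) - (len(left) * G(left.mean())
                               + len(right) * G(right.mean())) / len(yb)
        if drop > best_drop:
            best_i, best_drop = i, drop
    return best_i, best_drop

def minibatch_top_down(X, y, t, b, rng=None, G=gini):
    # Grow a tree with t leaves. Each candidate split is scored on a fresh
    # minibatch of at most b examples reaching the leaf, weighted by the
    # empirical probability of reaching it -- the key point being that only
    # b examples per decision, not the whole dataset, are ever inspected.
    rng = rng or np.random.default_rng(0)
    leaves = [[]]                                  # each leaf = its path
    while len(leaves) < t:
        best = None                                # (score, leaf idx, coord)
        for li, path in enumerate(leaves):
            idx = np.flatnonzero(_reach(X, path))
            if len(idx) == 0:
                continue
            batch = rng.choice(idx, size=min(b, len(idx)), replace=False)
            i, drop = _best_split(X[batch], y[batch], {j for j, _ in path}, G)
            if i is None:
                continue
            score = (len(idx) / len(X)) * drop     # weight by Pr[reach leaf]
            if best is None or score > best[0]:
                best = (score, li, i)
        if best is None:
            break
        _, li, i = best
        path = leaves.pop(li)
        leaves += [path + [(i, -1)], path + [(i, +1)]]
    # Label each leaf by the majority vote of the examples reaching it.
    tree = []
    for path in leaves:
        mask = _reach(X, path)
        tree.append((path, int(y[mask].mean() >= 0.5) if mask.any() else 0))
    return tree

def predict(tree, x):
    # Route x to the unique leaf whose path it satisfies.
    for path, label in tree:
        if all(x[i] == s for i, s in path):
            return label
    return 0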
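
A similarly hedged sketch of the "active local" idea behind LOCALLEARNER_G (Figure 2), reusing the helpers above: to output a label for x⋆, one never materializes the whole tree T, but develops only the root-to-leaf path x⋆ follows, querying the label oracle f on at most b points per node. This deliberately simplifies the paper's algorithm, which reproduces the exact tree MINIBATCHTOPDOWN_G would build (including the global order of leaf expansions); here a fixed-depth greedy path is grown instead, and local_label and depth are our hypothetical names.

def local_label(X, oracle, x_star, depth, b, rng=None, G=gini):
    # Simplified "active local" computation of a tree label for x_star:
    # develop only the path x_star would follow, calling oracle(x) = f(x)
    # on at most b points per node rather than labeling all of X.
    rng = rng or np.random.default_rng(0)
    path = []
    for _ in range(depth):
        idx = np.flatnonzero(_reach(X, path))
        if len(idx) < 2:
            break
        batch = rng.choice(idx, size=min(b, len(idx)), replace=False)
        yb = np.array([oracle(x) for x in X[batch]])   # selective label queries
        i, _drop = _best_split(X[batch], yb, {j for j, _ in path}, G)
        if i is None:
            break
        path.append((i, int(x_star[i])))               # follow x_star's branch
    idx = np.flatnonzero(_reach(X, path))
    if len(idx) == 0:
        return 0
    leaf = rng.choice(idx, size=min(b, len(idx)), replace=False)
    return int(np.mean([oracle(x) for x in X[leaf]]) >= 0.5)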
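
Finally, a toy run in the setting quoted under Open Datasets (x uniform over {±1}^d, monotone target f); the majority function is our illustrative choice, not one from the paper.

def majority(x):
    # A monotone Boolean target, used purely for illustration.
    return int(x.sum() > 0)

rng = np.random.default_rng(1)
d, n = 20, 5000
X = rng.choice(np.array([-1, 1]), size=(n, d))   # x uniform over {±1}^d
y = np.array([majority(x) for x in X])           # labeled sample S

tree = minibatch_top_down(X, y, t=16, b=200, rng=rng)
x_star = rng.choice(np.array([-1, 1]), size=d)
print("full tree label:", predict(tree, x_star))
print("local estimate: ", local_label(X, majority, x_star, depth=4, b=200, rng=rng))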