On the Recursive Teaching Dimension of VC Classes

Authors: Xi Chen, Yu Cheng, Bo Tang

NeurIPS 2016

Reproducibility Variable Result LLM Response
Research Type: Theoretical. In this paper, we study the quantitative relation between RTD and the well-known learning complexity measure VC dimension (VCD), and improve the best known upper and (worst-case) lower bounds on the recursive teaching dimension with respect to the VC dimension. Given a concept class C ⊆ {0, 1}^n with VCD(C) = d, we first show that RTD(C) is at most d · 2^(d+1). This is the first upper bound for RTD(C) that depends only on VCD(C), independent of the size of the concept class |C| and its domain size n. Before our work, the best known upper bound for RTD(C) is O(d · 2^d · log log |C|), obtained by Moran et al. [MSWY15]. We remove the log log |C| factor. We also improve the lower bound on the worst-case ratio of RTD(C) to VCD(C). We present a family of classes {C_k}, k ≥ 1, with VCD(C_k) = 3k and RTD(C_k) = 5k, which implies that the ratio of RTD(C) to VCD(C) in the worst case can be as large as 5/3.
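As an illustrative sketch (not taken from the paper), both quantities in the abstract can be computed by brute force for tiny concept classes over {0, 1}^n. The function names below (`vcd`, `teaching_dim`, `rtd`) are hypothetical helpers, and the recursive-peeling definition of RTD is implemented directly from its standard description:

```python
from itertools import combinations

def vcd(C, n):
    """Largest d such that some d-subset of {0, ..., n-1} is shattered by C,
    i.e. all 2^d labelings appear among the restrictions of concepts in C."""
    best = 0
    for d in range(1, n + 1):
        if any(len({tuple(c[i] for i in S) for c in C}) == 2 ** d
               for S in combinations(range(n), d)):
            best = d
    return best

def teaching_dim(c, C, n):
    """Minimum size of a sample set on which c differs from every other
    concept in C (a teaching set for c with respect to C)."""
    others = [h for h in C if h != c]
    for k in range(n + 1):
        for S in combinations(range(n), k):
            if all(any(h[i] != c[i] for i in S) for h in others):
                return k
    return n

def rtd(C, n):
    """Recursive teaching dimension: repeatedly remove the concepts whose
    teaching dimension within the remaining class is minimal; RTD is the
    largest teaching-set size used in any round."""
    C = list(C)
    worst = 0
    while C:
        tds = {c: teaching_dim(c, C, n) for c in C}
        m = min(tds.values())
        worst = max(worst, m)
        C = [c for c in C if tds[c] != m]
    return worst

# Example: the empty concept plus the three singletons over n = 3.
C = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(vcd(C, 3), rtd(C, 3))  # -> 1 1
```

This exhaustive search is exponential in n and |C|, so it only illustrates the definitions; the paper's bounds concern how large RTD can be relative to VCD in general, not how to compute either efficiently.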
Researcher Affiliation: Academia. Xi Chen, Department of Computer Science, Columbia University, xichen@cs.columbia.edu; Yu Cheng, Department of Computer Science, University of Southern California, yu.cheng.1@usc.edu; Bo Tang, Department of Computer Science, Oxford University, tangbonk1@gmail.com.
Pseudocode: No. The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code: No. The paper states: 'We also include a text file with the entire concept class C0 (as a 100 × 12 matrix) in the supplemental material.' This refers to a data file, not source code for the methodology or proofs. While the paper mentions using SAT solvers such as Lingeling and Glucose, it only provides general links to their public websites, not the authors' own implementation.
Open Datasets: No. The paper is theoretical and does not use or reference publicly available datasets for training or evaluation. It constructs concept classes (e.g., C0) for theoretical demonstrations, which are provided as a text file in the supplemental material, but these are not datasets in the usual machine-learning sense.
Dataset Splits: No. The paper is theoretical and involves no empirical experiments with data, so it does not describe training, validation, or test splits.
Hardware Specification: No. The paper mentions using state-of-the-art SAT solvers (Lingeling and Glucose) but provides no details about the hardware (e.g., CPU or GPU models, memory) used for these computations.
Software Dependencies: Yes. 'The SAT solvers we use are Lingeling [Bie15] and Glucose [AS14] (based on MiniSat [ES03]).' Glucose 4.0, 2014, available at http://www.labri.fr/perso/lsimon/glucose.
Experiment Setup: No. The paper is theoretical and does not describe an experimental setup (hyperparameters, model initialization, or training schedules); it focuses on mathematical proofs and on formulating a SAT instance to find theoretical constructs.