Sample-Efficient Learning of Mixtures

Authors: Hassan Ashtiani, Shai Ben-David, Abbas Mehrabian

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Our framework is algorithmic, but its running time is exponential in the number of required samples. Proving sample complexity bounds is important in understanding the statistical nature of various classes of distributions (see, e.g., the recent work of Diakonikolas, Kane, and Stewart (2017a)), and may pave the way for developing efficient algorithms. In our main result, Theorem 5, assuming the existence of a method for learning F with sample complexity m_F(ε), we provide a method for learning F_k with sample complexity O(k log² k · m_F(ε) / ε²). (This bound is restated after the table.)
Researcher Affiliation | Academia | Hassan Ashtiani and Shai Ben-David: David R. Cheriton School of Computer Science, University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada, N2L 3G1 ({mhzokaei, shai}@uwaterloo.ca); Abbas Mehrabian: Simons Institute for the Theory of Computing, 121 Calvin Lab #2190, University of California, Berkeley, Berkeley, CA, USA 94720-2190 (abbasmehrabian@gmail.com)
Pseudocode | Yes | Figure 1: Algorithm for learning the mixture class F_k. (A hedged illustrative sketch, which is not the paper's Figure 1, follows the table.)
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | No | The paper is theoretical and focuses on sample complexity bounds for PAC learning distributions rather than conducting empirical experiments on specific datasets. It refers to an 'i.i.d. sample generated from an unknown target distribution' only conceptually; no concrete dataset with access information is used or provided.
Dataset Splits | No | The paper is theoretical and does not present empirical experiments. Therefore, there is no mention of training, validation, or test dataset splits.
Hardware Specification | No | The paper is theoretical and focuses on sample complexity. It does not describe any empirical experiments that would require specific hardware. Therefore, no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not describe empirical experiments or their implementation details. Therefore, no specific software dependencies with version numbers are provided.
Experiment Setup | No | The paper is theoretical and focuses on sample complexity bounds. It does not present empirical experiments with specific models or training runs. Therefore, no experimental setup details such as hyperparameters or system-level training settings are provided.
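
For reference, the sample-complexity bound quoted in the Research Type row can be written out as follows. This is only a restatement of the quoted O(·) expression, with m_{F_k} denoting the sample complexity of the mixture class; it hides constant factors and does not show any dependence on the confidence parameter.

```latex
% Theorem 5, as quoted above: if the base class F is learnable from
% m_F(eps) samples, then the class F_k of k-mixtures of distributions
% from F is learnable from O(k log^2(k) m_F(eps) / eps^2) samples.
\[
  m_{\mathcal{F}_k}(\varepsilon)
  \;=\; O\!\left( \frac{k \,\log^{2} k \;\cdot\; m_{\mathcal{F}}(\varepsilon)}{\varepsilon^{2}} \right)
\]
```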
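
The paper's Figure 1 pseudocode is not reproduced here. The sketch below is only a hedged illustration of the general two-stage pattern suggested by the Pseudocode row (run a base learner on subsets of the data to produce candidate components, enumerate candidate k-mixtures, and select one by a score on held-out data); it is not the paper's algorithm. All names (fit_gaussian, learn_mixture, the subset rule, the weight grid) are hypothetical, and held-out log-likelihood is used as a stand-in for the paper's selection step.

```python
# Hypothetical sketch (NOT the paper's Figure 1): learn a k-mixture by
# (1) fitting base-class candidates on subsets of the data and
# (2) selecting the candidate mixture that scores best on held-out data.
from itertools import product
from typing import Callable, List, Sequence, Tuple

import numpy as np

Component = Callable[[np.ndarray], np.ndarray]  # maps points x to densities p(x)


def fit_gaussian(sample: np.ndarray) -> Component:
    """Toy base learner: fit a 1-D Gaussian by maximum likelihood."""
    mu, sigma = float(np.mean(sample)), float(np.std(sample)) + 1e-9

    def density(x: np.ndarray) -> np.ndarray:
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

    return density


def learn_mixture(
    data: np.ndarray,
    k: int,
    base_learner: Callable[[np.ndarray], Component] = fit_gaussian,
    weight_grid: Sequence[float] = (0.25, 0.5, 0.75),
) -> Tuple[List[Component], Tuple[float, ...]]:
    """Return (components, weights) of the best-scoring candidate k-mixture."""
    rng = np.random.default_rng(0)
    train, holdout = data[: len(data) // 2], data[len(data) // 2 :]

    # Candidate components: the base learner run on a few random subsets of the
    # training half (the paper's candidate generation is different).
    subsets = [rng.choice(train, size=max(2, len(train) // k), replace=False)
               for _ in range(3 * k)]
    components = [base_learner(s) for s in subsets]

    # Enumerate candidate mixtures; this brute-force enumeration is exponential
    # in k, echoing the exponential running time noted in the Research Type row.
    best_score, best = -np.inf, None
    for comps in product(components, repeat=k):      # choose k components
        for w in product(weight_grid, repeat=k):     # choose gridded weights
            w = np.array(w) / sum(w)                 # normalise to sum to 1
            dens = sum(wi * c(holdout) for wi, c in zip(w, comps))
            score = float(np.mean(np.log(dens + 1e-300)))  # held-out log-lik.
            if score > best_score:
                best_score, best = score, (list(comps), tuple(w))
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 300)])
    components, weights = learn_mixture(x, k=2)
    print("selected mixture weights:", weights)
```

Running the script fits a 2-component mixture to a synthetic bimodal sample and prints the selected weights; the candidate enumeration over components and gridded weights is what makes the sketch, like the approach described in the quote, exponential-time despite being sample-efficient.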