Estimating Learnability in the Sublinear Data Regime

Authors: Weihao Kong, Gregory Valiant

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the practical viability of our approaches on synthetic and real data. All experiments were run in Matlab v2016b on a MacBook Pro laptop, and the code is available from our websites. More details of the experiments are given in the supplementary material.
Researcher Affiliation | Academia | Weihao Kong, Stanford University, whkong@stanford.edu; Gregory Valiant, Stanford University, gvaliant@cs.stanford.edu
Pseudocode | Yes | Algorithm 1. Estimating Linearity, General Covariance; Algorithm 2. Estimating Classification Error, General Covariance
Open Source Code | Yes | All experiments were run in Matlab v2016b on a MacBook Pro laptop, and the code is available from our websites.
Open Datasets | Yes | Regression: NLP Experiments. This data is from Kaggle's Wine-Reviews dataset... Binary Classification: MNIST. We also evaluated Algorithm 2 for predicting the classification error on the MNIST dataset.
Dataset Splits | No | The paper frequently mentions 'test err Bayes-Opt' and 'training err Bayes-Opt' in figures for comparison, and for MNIST it states 'training on 50k datapoints and testing on the remaining datapoints' as part of the ground-truth calculation. However, it does not explicitly describe a validation split or a methodology for hyperparameter tuning or early stopping for its own models.
Hardware Specification | Yes | All experiments were run in Matlab v2016b on a MacBook Pro laptop
Software Dependencies | Yes | All experiments were run in Matlab v2016b
Experiment Setup | Yes | Regression: Synthetic Data Experiments. In this experiment, n datapoints x_1, ..., x_n ∈ R^d are drawn from a multivariate Gaussian, N(0, Σ)... Binary Classification: Synthetic Data Experiments. ... β is a d-dimensional vector with ‖β‖ = 2... Each image is represented as a d = 784 dimensional vector, and the data are zero-centered and scaled so the largest singular value of the sample covariance matrix is 1.
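
The preprocessing named in the Experiment Setup row (zero-centering the data and rescaling so that the largest singular value of the sample covariance matrix is 1) can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' Matlab code. The function name `center_and_scale`, the values of n and d, the diagonal choice of Σ, and the 1/n covariance convention are all assumptions made for the example, and synthetic Gaussian draws stand in for the image vectors.

```python
import numpy as np


def center_and_scale(X):
    """Zero-center the columns of X, then rescale so the sample covariance
    of the result has largest singular value 1 (hypothetical helper)."""
    Xc = X - X.mean(axis=0)
    n = Xc.shape[0]
    # Assumption: sample covariance uses the 1/n convention; the source
    # does not say whether 1/n or 1/(n-1) is used.
    cov = Xc.T @ Xc / n
    # For a 2-D array, ord=2 gives the spectral norm (largest singular value).
    top = np.linalg.norm(cov, 2)
    # Scaling the data by 1/sqrt(top) scales the covariance by 1/top.
    return Xc / np.sqrt(top)


rng = np.random.default_rng(0)
n, d = 200, 50  # illustrative sizes, not taken from the paper

# Assumed covariance: diagonal with a decaying spectrum.
Sigma = np.diag(np.linspace(1.0, 0.1, d))

# Draw n datapoints x_1, ..., x_n in R^d from N(0, Sigma).
X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)

X_scaled = center_and_scale(X)
cov_scaled = X_scaled.T @ X_scaled / n
print(np.linalg.norm(cov_scaled, 2))
```

Dividing by the square root of the top singular value, rather than by the value itself, is what makes the covariance (which is quadratic in the data) end up with spectral norm 1.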