Estimating Learnability in the Sublinear Data Regime
Authors: Weihao Kong, Gregory Valiant
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the practical viability of our approaches on synthetic and real data. All experiments were run in Matlab v2016b on a Mac Book Pro laptop, and the code is available from our websites. More details of the experiments are given in the supplementary material. |
| Researcher Affiliation | Academia | Weihao Kong Stanford University whkong@stanford.edu Gregory Valiant Stanford University gvaliant@cs.stanford.edu |
| Pseudocode | Yes | Algorithm 1: Estimating Linearity, General Covariance; Algorithm 2: Estimating Classification Error, General Covariance |
| Open Source Code | Yes | All experiments were run in Matlab v2016b on a Mac Book Pro laptop, and the code is available from our websites. |
| Open Datasets | Yes | Regression: NLP Experiments. This data is from Kaggle's Wine-Reviews dataset... Binary Classification: MNIST. We also evaluated Algorithm 2 for predicting the classification error on the MNIST dataset. |
| Dataset Splits | No | The paper frequently mentions 'test err Bayes-Opt' and 'training err Bayes-Opt' in figures for comparison, and for MNIST, states 'training on 50k datapoints and testing on the remaining datapoints' as part of ground truth calculation. However, it does not explicitly describe a validation split or methodology used for hyperparameter tuning or early stopping for its own models. |
| Hardware Specification | Yes | All experiments were run in Matlab v2016b on a Mac Book Pro laptop |
| Software Dependencies | Yes | All experiments were run in Matlab v2016b |
| Experiment Setup | Yes | Regression: Synthetic Data Experiments. In this experiment, n datapoints $x_1, \ldots, x_n \in \mathbb{R}^d$ are drawn from a multivariate Gaussian, $N(0, \Sigma)$... Binary Classification: Synthetic Data Experiments. ... $\beta$ is a d-dimensional vector with $\lVert \beta \rVert = 2$... Each image is represented as a d = 784 dimensional vector, and the data are zero-centered and scaled so the largest singular value of the sample covariance matrix is 1. |
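
The two data-preparation steps quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' Matlab code. The function names, the identity covariance in the usage example, and the 1/n normalization of the sample covariance are assumptions made here; the quoted text does not specify them.

```python
import numpy as np


def sample_gaussian_data(n, d, cov, seed=0):
    """Draw n datapoints in R^d from a zero-mean Gaussian N(0, cov)."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(np.zeros(d), cov, size=n)


def center_and_scale(X):
    """Zero-center the columns of X, then rescale so that the largest
    singular value of the sample covariance matrix is 1.

    Assumption: the sample covariance is taken as X^T X / n.
    """
    Xc = X - X.mean(axis=0)
    n = Xc.shape[0]
    # The top singular value of Xc^T Xc / n equals (top singular value of Xc)^2 / n,
    # so dividing Xc by top / sqrt(n) makes that value exactly 1.
    top = np.linalg.svd(Xc, compute_uv=False)[0]
    return Xc * (np.sqrt(n) / top)


# Usage: identity covariance is a placeholder choice, not taken from the paper.
X = sample_gaussian_data(n=200, d=5, cov=np.eye(5))
X_scaled = center_and_scale(X)
```

Dividing by the top singular value of the centered data matrix avoids forming the d-by-d covariance explicitly, which is only a convenience at d = 784 but keeps the sketch short.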