Statistical Model Criticism using Kernel Two Sample Tests

Authors: James R. Lloyd, Zoubin Ghahramani

NeurIPS 2015

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | We demonstrate on synthetic data that the selected statistic, called the witness function, can be used to identify where a statistical model most misrepresents the data it was trained on. We then apply the procedure to real data where the models being assessed are restricted Boltzmann machines, deep belief networks and Gaussian process regression and demonstrate the ways in which these models fail to capture the properties of the data they are trained on.

Researcher Affiliation | Academia | James Robert Lloyd, Department of Engineering, University of Cambridge; Zoubin Ghahramani, Department of Engineering, University of Cambridge

Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.

Open Source Code | Yes | For details see code at [redacted]

Open Datasets | Yes | We demonstrate MMD model criticism on toy examples, restricted Boltzmann machines and deep belief networks trained on MNIST digits and Gaussian process regression models trained on several time series. Our proposed method identifies discrepancies between the data and fitted models that would not be apparent from predictive performance focused metrics.

Dataset Splits | Yes | We use a radial basis function kernel and select the lengthscale by 5-fold cross-validation, using predictive likelihood of the kernel density estimate as the selection criterion. To construct p-values we use held-out data, using the same split of training and testing data as the interpolation experiment in [7].

Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments are provided in the paper.

Software Dependencies | No | The paper refers to using settings from a deep learning tutorial [25] but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).

Experiment Setup | Yes | We trained an RBM with architecture (784)-(500)-(10) using 15 epochs of persistent contrastive divergence (PCD-15), a batch size of 20 and a learning rate of 0.1 (i.e. we used the same settings as the code available at the deep learning tutorial [25]).
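The witness function mentioned in the Research Type row is the standard MMD witness: the difference between the mean kernel embedding of the data and that of samples from the model, so it is large where the model under-represents the data and negative where it over-represents it. A minimal sketch (the toy distributions, function names and lengthscale below are illustrative, not from the paper):

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    # Squared-exponential kernel matrix between rows of X and rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * lengthscale ** 2))

def witness(grid, data, model_samples, lengthscale=1.0):
    # MMD witness at each grid point: mean kernel similarity to the data
    # minus mean kernel similarity to samples drawn from the model.
    return (rbf_kernel(grid, data, lengthscale).mean(axis=1)
            - rbf_kernel(grid, model_samples, lengthscale).mean(axis=1))

# Toy check: data is N(0, 1); the "model" is mis-fit as N(0.5, 1).
rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=(500, 1))
model = rng.normal(0.5, 1.0, size=(500, 1))
grid = np.linspace(-4.0, 4.0, 81)[:, None]
w = witness(grid, data, model)
# w is positive on the left (data mass the model misses) and
# negative on the right (mass the model invents).
```

Plotting `w` against `grid` is what lets the procedure point at *where* the model misrepresents the data, rather than only reporting a scalar test statistic.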
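The Dataset Splits row says the RBF lengthscale was chosen by 5-fold cross-validation on the predictive likelihood of a kernel density estimate. A hedged scikit-learn sketch of that selection step (the data and the bandwidth grid are placeholders, not the paper's):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KernelDensity

# Placeholder data standing in for the real features.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))

# KernelDensity.score returns held-out log-likelihood, so GridSearchCV
# here selects the bandwidth (lengthscale) by 5-fold CV predictive
# likelihood, as described in the quote.
search = GridSearchCV(
    KernelDensity(kernel="gaussian"),
    {"bandwidth": np.logspace(-1, 1, 20)},
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X)
best_lengthscale = search.best_params_["bandwidth"]
```

The p-values are then computed on held-out data (the paper reuses the train/test split from the interpolation experiment in [7]), which keeps the lengthscale selection independent of the test.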
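The Experiment Setup row's settings can be approximated with scikit-learn's `BernoulliRBM`, which trains by stochastic gradient persistent contrastive divergence. This is only a sketch: sklearn takes a single Gibbs step per update rather than the tutorial's PCD-15, it fits one 784-500 layer rather than the full stack, and the binary data below is a random stand-in for MNIST:

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

# Random binary stand-in for binarized MNIST (784 visible units).
rng = np.random.default_rng(0)
X = (rng.random((1000, 784)) > 0.7).astype(np.float64)

# Batch size 20, learning rate 0.1, 15 epochs, as in the quoted setup;
# sklearn's PCD uses one Gibbs step per update, unlike PCD-15.
rbm = BernoulliRBM(n_components=500, learning_rate=0.1,
                   batch_size=20, n_iter=15, random_state=0)
rbm.fit(X)

# One Gibbs step from held-out visibles gives approximate model samples,
# which is the ingredient the MMD criticism compares against the data.
samples = rbm.gibbs(X[:5])
```

For model criticism, one would run many Gibbs steps (or a longer persistent chain) to draw samples closer to the RBM's equilibrium distribution before feeding them to the two-sample test.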