Statistical Model Criticism using Kernel Two Sample Tests

Authors: James R. Lloyd, Zoubin Ghahramani

NeurIPS 2015

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | We demonstrate on synthetic data that the selected statistic, called the witness function, can be used to identify where a statistical model most misrepresents the data it was trained on. We then apply the procedure to real data where the models being assessed are restricted Boltzmann machines, deep belief networks and Gaussian process regression and demonstrate the ways in which these models fail to capture the properties of the data they are trained on.

Researcher Affiliation | Academia | James Robert Lloyd, Department of Engineering, University of Cambridge; Zoubin Ghahramani, Department of Engineering, University of Cambridge

Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.

Open Source Code | Yes | For details see code at [redacted]

Open Datasets | Yes | We demonstrate MMD model criticism on toy examples, restricted Boltzmann machines and deep belief networks trained on MNIST digits and Gaussian process regression models trained on several time series. Our proposed method identifies discrepancies between the data and fitted models that would not be apparent from predictive performance focused metrics.

Dataset Splits | Yes | We use a radial basis function kernel and select the lengthscale by 5-fold cross-validation, using predictive likelihood of the kernel density estimate as the selection criterion. To construct p-values we use held-out data, using the same split of training and testing data as the interpolation experiment in [7].

Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments are provided in the paper.

Software Dependencies | No | The paper refers to using settings from a deep learning tutorial [25] but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).

Experiment Setup | Yes | We trained an RBM with architecture (784)-(500)-(10) using 15 epochs of persistent contrastive divergence (PCD-15), a batch size of 20 and a learning rate of 0.1 (i.e. we used the same settings as the code available at the deep learning tutorial [25]).
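The witness function mentioned in the Research Type row is the standard MMD witness: the difference between the mean kernel embedding of the data and that of samples from the model, so it is large where the model under-represents the data and negative where it over-represents it. A minimal sketch (the toy distributions, function names and lengthscale below are illustrative, not from the paper):

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    # Squared-exponential kernel matrix between rows of X and rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * lengthscale ** 2))

def witness(grid, data, model_samples, lengthscale=1.0):
    # MMD witness at each grid point: mean kernel similarity to the data
    # minus mean kernel similarity to samples drawn from the model.
    return (rbf_kernel(grid, data, lengthscale).mean(axis=1)
            - rbf_kernel(grid, model_samples, lengthscale).mean(axis=1))

# Toy check: data is N(0, 1); the "model" is mis-fit as N(0.5, 1).
rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=(500, 1))
model = rng.normal(0.5, 1.0, size=(500, 1))
grid = np.linspace(-4.0, 4.0, 81)[:, None]
w = witness(grid, data, model)
# w is positive on the left (data mass the model misses) and
# negative on the right (mass the model invents).
```

Plotting `w` against `grid` is what lets the procedure point at *where* the model misrepresents the data, rather than only reporting a scalar test statistic.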
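The Dataset Splits row says the RBF lengthscale was chosen by 5-fold cross-validation on the predictive likelihood of a kernel density estimate. A hedged scikit-learn sketch of that selection step (the data and the bandwidth grid are placeholders, not the paper's):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KernelDensity

# Placeholder data standing in for the real features.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))

# KernelDensity.score returns held-out log-likelihood, so GridSearchCV
# here selects the bandwidth (lengthscale) by 5-fold CV predictive
# likelihood, as described in the quote.
search = GridSearchCV(
    KernelDensity(kernel="gaussian"),
    {"bandwidth": np.logspace(-1, 1, 20)},
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X)
best_lengthscale = search.best_params_["bandwidth"]
```

The p-values are then computed on held-out data (the paper reuses the train/test split from the interpolation experiment in [7]), which keeps the lengthscale selection independent of the test.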
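The Experiment Setup row's settings can be approximated with scikit-learn's `BernoulliRBM`, which trains by stochastic gradient persistent contrastive divergence. This is only a sketch: sklearn takes a single Gibbs step per update rather than the tutorial's PCD-15, it fits one 784-500 layer rather than the full stack, and the binary data below is a random stand-in for MNIST:

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

# Random binary stand-in for binarized MNIST (784 visible units).
rng = np.random.default_rng(0)
X = (rng.random((1000, 784)) > 0.7).astype(np.float64)

# Batch size 20, learning rate 0.1, 15 epochs, as in the quoted setup;
# sklearn's PCD uses one Gibbs step per update, unlike PCD-15.
rbm = BernoulliRBM(n_components=500, learning_rate=0.1,
                   batch_size=20, n_iter=15, random_state=0)
rbm.fit(X)

# One Gibbs step from held-out visibles gives approximate model samples,
# which is the ingredient the MMD criticism compares against the data.
samples = rbm.gibbs(X[:5])
```

For model criticism, one would run many Gibbs steps (or a longer persistent chain) to draw samples closer to the RBM's equilibrium distribution before feeding them to the two-sample test.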