Interpretable Distribution Features with Maximum Testing Power

Authors: Wittawat Jitkrittum, Zoltán Szabó, Kacper P. Chwialkowski, Arthur Gretton

NeurIPS 2016

Reproducibility variables, with the result for each and the LLM response quoted as evidence:
Research Type: Experimental
LLM response: "In this section, we demonstrate the effectiveness of the proposed methods on both toy and real problems. The results are shown in Fig. 2, where the type-I error (for the SG problem) and the test power (for the GMD, GVD, and Blobs problems) are plotted against the test sample size."
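For concreteness, here is a minimal sketch of how one point on such a curve is typically estimated: draw fresh samples, run the level-α test, and average the rejections. The callables `sample_pair` and `test_stat`, the trial count, and the SG example at the end are illustrative assumptions, not the authors' protocol code.

```python
import numpy as np
from scipy import stats

def rejection_rate(sample_pair, test_stat, J, alpha=0.01, trials=200):
    """Empirical rejection rate of an asymptotically chi^2(J) test.

    On a null problem such as SG (P = Q) this estimates the type-I error;
    on GMD, GVD, or Blobs it estimates the test power.
    """
    rng = np.random.default_rng(0)
    threshold = stats.chi2.ppf(1 - alpha, df=J)   # level-alpha rejection threshold
    rejections = 0
    for _ in range(trials):
        X, Y = sample_pair(rng)                   # draw a fresh problem instance
        rejections += test_stat(X, Y) > threshold
    return rejections / trials

# SG null problem: both samples from N(0, I_d), so a rate well above
# alpha would indicate an inflated type-I error.
n, d = 500, 10
sg = lambda rng: (rng.standard_normal((n, d)), rng.standard_normal((n, d)))
```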
Researcher Affiliation: Academia
LLM response: "Wittawat Jitkrittum, Zoltán Szabó, Kacper Chwialkowski, Arthur Gretton (wittawatj@gmail.com, zoltan.szabo.m@gmail.com, kacper.chwialkowski@gmail.com, arthur.gretton@gmail.com), Gatsby Unit, University College London."
Pseudocode: No
LLM response: The paper describes methods and procedures in text but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code: Yes
LLM response: "All the code and preprocessed data are available at https://github.com/wittawatj/interpretable-test."
Open Datasets: Yes
LLM response: "We first consider the problem of distinguishing two categories of publications at the conference on Neural Information Processing Systems (NIPS)... Each paper is represented as a bag of words using TF-IDF (Manning et al., 2008) as features. We use the Karolinska Directed Emotional Faces (KDEF) dataset (Lundqvist et al., 1998)..."
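As an illustration of the TF-IDF bag-of-words representation used for the NIPS text experiment, the sketch below uses scikit-learn's TfidfVectorizer on placeholder abstracts; the actual corpus, tokenization, and vocabulary choices in the paper may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Placeholder abstracts standing in for the two NIPS publication categories.
papers_p = ["bayesian inference with gaussian processes",
            "variational methods for latent variable models"]
papers_q = ["convolutional networks for image classification",
            "recurrent architectures for sequence modeling"]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(papers_p + papers_q)  # sparse (n_docs, vocab_size)
X = tfidf[:len(papers_p)].toarray()   # sample from P (category 1)
Y = tfidf[len(papers_p):].toarray()   # sample from Q (category 2)
```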
Dataset Splits: Yes
LLM response: "To avoid creating a dependency between θ and the data used for testing (which would affect the null distribution), we split the data into two disjoint sets. Let D := (X, Y) and D^tr, D^te ⊂ D such that D^tr ∩ D^te = ∅ and D^tr ∪ D^te = D. In practice, since μ and Σ are unknown, we use λ̂^tr_{n/2} in place of λ_n, where λ̂^tr_{n/2} is the test statistic computed on the training set D^tr. For simplicity, we assume that each of D^tr and D^te has half of the samples in D."
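A minimal sketch of this 50/50 disjoint split, assuming paired samples of equal size (the function name and RNG handling are illustrative, not the authors' code):

```python
import numpy as np

def split_half(X, Y, seed=0):
    """Split paired samples into disjoint halves D_tr and D_te.

    D_tr is used only to choose theta (test locations / kernel parameters);
    D_te is used only to compute the test statistic, so the optimization
    does not distort the null distribution.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(X.shape[0])
    tr, te = perm[:len(perm) // 2], perm[len(perm) // 2:]
    return (X[tr], Y[tr]), (X[te], Y[te])
```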
Hardware Specification: No
LLM response: The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies: No
LLM response: The paper mentions general tools and libraries but does not provide specific software names with version numbers (e.g., Python 3.x, PyTorch x.x) used for the implementation.
Experiment Setup: Yes
LLM response: "We set α = 0.01 in all the experiments. J is pre-specified and fixed. For the ME test, we initialize the test locations with realizations from two multivariate normal distributions fitted to samples from P and Q; this ensures that the initial locations are well supported by the data. For the SCF test, initialization using the standard normal distribution is found to be sufficient. The regularization parameter γ_n is not optimized; we set it to be as small as possible while being large enough to ensure that (S_n + γ_n I)^{-1} can be stably computed."
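The quoted setup refers to the ME statistic λ̂_n = n · z̄^T (S_n + γ_n I)^{-1} z̄, where each z_i stacks the kernel differences k(x_i, v_j) - k(y_i, v_j) at the J test locations. Below is a minimal NumPy sketch, not the authors' implementation; the Gaussian kernel, the bandwidth `sigma2`, and the default `gamma` value are assumptions for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist

def me_statistic(X, Y, V, sigma2, gamma=1e-5):
    """ME test statistic lambda_n = n * zbar^T (S + gamma I)^{-1} zbar.

    X, Y : (n, d) samples from P and Q; V : (J, d) test locations;
    assumes a Gaussian kernel k(x, v) = exp(-||x - v||^2 / (2 * sigma2)).
    Under H0 the statistic is asymptotically chi^2(J).
    """
    n = X.shape[0]
    Kx = np.exp(-cdist(X, V, 'sqeuclidean') / (2.0 * sigma2))  # (n, J)
    Ky = np.exp(-cdist(Y, V, 'sqeuclidean') / (2.0 * sigma2))
    Z = Kx - Ky                      # rows are z_i, shape (n, J)
    zbar = Z.mean(axis=0)
    S = np.cov(Z, rowvar=False)      # (J, J) sample covariance of the z_i
    # gamma keeps (S + gamma I)^{-1} numerically stable, per the quoted setup.
    return n * zbar @ np.linalg.solve(S + gamma * np.eye(V.shape[0]), zbar)

# Reject H0 at alpha = 0.01 when the statistic exceeds the chi^2(J) quantile,
# e.g. scipy.stats.chi2.ppf(0.99, df=J).
```

In the paper, the test locations V (and kernel parameters) are chosen on D^tr by maximizing this statistic, a proxy for test power, before the statistic is recomputed on the held-out D^te for the actual test.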