Nonparametric Independence Testing for Small Sample Sizes

Authors: Aaditya Ramdas, Leila Wehbe

IJCAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our main contribution is strong empirical evidence that, by employing shrunk operators when the sample size is small, one can attain an improvement in power at low false positive rates. We perform synthetic experiments in a wide variety of settings to demonstrate that the shrunk test statistics achieve higher power than HSIC. We use two real datasets.
Researcher Affiliation | Academia | Aaditya Ramdas, Dept. of Statistics and Machine Learning Dept., Carnegie Mellon University (aramdas@cs.cmu.edu); Leila Wehbe, Machine Learning Dept., Carnegie Mellon University (lwehbe@cs.cmu.edu)
Pseudocode | No | The paper describes its methods in prose and mathematical equations but does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not state that its source code is released, nor does it link to a code repository for the described methodology.
Open Datasets | Yes | We use two real datasets: the first is the Eckerle dataset [Eckerle, 1979] from the NIST Statistical Reference Datasets (NIST StRD) for Nonlinear Regression, data from a NIST study of circular interference transmittance (n=35, Y is transmittance, X is wavelength). The second is the Aircraft dataset [Bowman and Azzalini, 2014] (n=709, X is log(speed), Y is log(span)).
Dataset Splits | No | The paper mentions 'leave-one-out cross-validation (LOOCV)' for parameter estimation, but does not explicitly provide details about training, validation, or testing dataset splits for model evaluation.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments.
Experiment Setup | No | While the paper describes parameters for statistical tests (e.g., type-1 error α, number of repetitions and permutations) and kernel bandwidth selection, it does not specify typical machine learning experimental setup details such as learning rates, batch sizes, optimizers, or number of epochs.
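For readers unfamiliar with the kind of experiment setup the last row refers to, the following is a minimal sketch of a permutation-based HSIC independence test. It is not the authors' code and does not use their shrunk estimators: the Gaussian kernel, median-heuristic bandwidth, alpha level of 0.05, and 1000 permutations are illustrative assumptions, not values reported in the paper.

```python
import numpy as np

def gaussian_kernel(Z, bandwidth):
    """Gaussian (RBF) kernel matrix for an (n, d) array Z."""
    sq = np.sum(Z**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * Z @ Z.T
    return np.exp(-d2 / (2 * bandwidth**2))

def median_bandwidth(Z):
    """Median heuristic: median pairwise distance between rows of Z (an assumption here)."""
    d2 = np.sum((Z[:, None, :] - Z[None, :, :])**2, axis=-1)
    return np.sqrt(np.median(d2[d2 > 0]))

def hsic_statistic(K, L):
    """Biased empirical HSIC: trace(K H L H) / n^2, with H the centering matrix."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n**2

def hsic_permutation_test(X, Y, alpha=0.05, n_perms=1000, rng=None):
    """Return (reject, p_value) for H0: X is independent of Y."""
    rng = np.random.default_rng(rng)
    X, Y = np.atleast_2d(X.T).T, np.atleast_2d(Y.T).T
    K = gaussian_kernel(X, median_bandwidth(X))
    L = gaussian_kernel(Y, median_bandwidth(Y))
    stat = hsic_statistic(K, L)
    # Permute Y (rows and columns of L) to simulate the null distribution.
    null = np.array([
        hsic_statistic(K, L[np.ix_(p, p)])
        for p in (rng.permutation(len(X)) for _ in range(n_perms))
    ])
    p_value = (1 + np.sum(null >= stat)) / (1 + n_perms)
    return p_value <= alpha, p_value

# Usage on synthetic dependent data:
# rng = np.random.default_rng(0)
# x = rng.normal(size=100)
# y = x**2 + 0.1 * rng.normal(size=100)
# print(hsic_permutation_test(x, y))
```

The permutation step is what fixes the test's type-1 error at the chosen α; the paper's contribution concerns replacing the empirical covariance operators inside the statistic with shrunk versions, which this sketch does not implement.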