Nonparametric Independence Testing for Small Sample Sizes
Authors: Aaditya Ramdas, Leila Wehbe
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our main contribution is strong empirical evidence that by employing shrunk operators when the sample size is small, one can attain an improvement in power at low false positive rates. We perform synthetic experiments in a wide variety of settings to demonstrate that the shrunk test statistics achieve higher power than HSIC. We also use two real datasets. |
| Researcher Affiliation | Academia | Aaditya Ramdas, Dept. of Statistics and Machine Learning Dept., Carnegie Mellon University (aramdas@cs.cmu.edu); Leila Wehbe, Machine Learning Dept., Carnegie Mellon University (lwehbe@cs.cmu.edu) |
| Pseudocode | No | The paper describes methods using prose and mathematical equations but does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its own source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | We use two real datasets. The first is the Eckerle dataset [Eckerle, 1979] from the NIST Statistical Reference Datasets (NIST StRD) for Nonlinear Regression: data from a NIST study of circular interference transmittance (n=35, Y is transmittance, X is wavelength). The second is the Aircraft dataset [Bowman and Azzalini, 2014] (n=709, X is log(speed), Y is log(span)). |
| Dataset Splits | No | The paper mentions 'leave-one-out cross-validation (LOOCV)' for parameter estimation, but does not explicitly provide details about training, validation, or testing dataset splits for model evaluation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments. |
| Experiment Setup | No | While the paper describes parameters for the statistical tests (e.g., type-1 error α, number of repetitions and permutations) and kernel bandwidth selection, it does not specify typical machine learning experimental setup details such as learning rates, batch sizes, optimizers, or number of epochs. Illustrative sketches of the baseline HSIC permutation test and of generic covariance shrinkage follow this table. |
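
For context on the baseline statistic referenced in the table, below is a minimal sketch of a standard HSIC permutation test. It assumes Gaussian kernels with median-heuristic bandwidths and 1000 permutations; these choices, and all function names, are illustrative assumptions, not details taken from the paper, and the code implements the ordinary (unshrunk) HSIC that the authors compare against rather than their shrunk estimators.

```python
import numpy as np

def gaussian_gram(x, bandwidth):
    """Gaussian (RBF) Gram matrix for a 1-D or 2-D sample array."""
    x = np.atleast_2d(x.reshape(len(x), -1))
    sq = np.sum(x**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T
    return np.exp(-d2 / (2.0 * bandwidth**2))

def median_bandwidth(x):
    """Median heuristic: median pairwise distance (a common, assumed default)."""
    x = np.atleast_2d(x.reshape(len(x), -1))
    sq = np.sum(x**2, axis=1)
    d = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0))
    return np.median(d[d > 0])

def hsic_statistic(K, L):
    """Biased empirical HSIC: trace(K H L H) / n^2, with H the centering matrix."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n**2

def hsic_permutation_test(x, y, num_permutations=1000, rng=None):
    """p-value from permuting y to simulate the independence null."""
    rng = np.random.default_rng(rng)
    K = gaussian_gram(x, median_bandwidth(x))
    L = gaussian_gram(y, median_bandwidth(y))
    observed = hsic_statistic(K, L)
    null_stats = np.empty(num_permutations)
    for b in range(num_permutations):
        perm = rng.permutation(len(y))
        null_stats[b] = hsic_statistic(K, L[np.ix_(perm, perm)])
    return (1 + np.sum(null_stats >= observed)) / (1 + num_permutations)

if __name__ == "__main__":
    # Small synthetic sample with a nonlinear dependence, mirroring the
    # small-n regime the paper targets (n chosen here for illustration only).
    rng = np.random.default_rng(0)
    x = rng.normal(size=35)
    y = x**2 + 0.5 * rng.normal(size=35)
    print("p-value:", hsic_permutation_test(x, y, rng=rng))
```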
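
The "shrunk operators" mentioned in the Research Type row refer to shrinkage estimation of (cross-)covariance operators. As a generic illustration of that idea only, the sketch below applies Ledoit-Wolf-style linear shrinkage of a sample covariance matrix toward a scaled identity; it is the familiar finite-dimensional analogue and does not reproduce the paper's operator-level estimators or how their shrinkage intensity is chosen (the paper mentions LOOCV for parameter estimation). The function name and the fixed `rho` are hypothetical.

```python
import numpy as np

def linear_shrinkage_covariance(X, rho):
    """Linear shrinkage of the sample covariance toward a scaled identity.

    S_shrunk = (1 - rho) * S + rho * (trace(S) / p) * I, with rho in [0, 1].
    Generic illustration of covariance shrinkage, not the paper's estimator.
    """
    n, p = X.shape
    S = np.cov(X, rowvar=False, bias=True)   # empirical covariance, p x p
    target = (np.trace(S) / p) * np.eye(p)   # scaled-identity shrinkage target
    return (1.0 - rho) * S + rho * target

if __name__ == "__main__":
    # With few samples and many dimensions the sample covariance is noisy;
    # shrinkage trades a little bias for a large variance reduction.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(20, 50))            # n = 20 samples, p = 50 dimensions
    print(linear_shrinkage_covariance(X, rho=0.3).shape)
```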