reproducibilityindex.ai

A Kernel-based Test of Independence for Cluster-correlated Data

Authors: Hongjiao Liu, Anna Plantinga, Yunhua Xiang, Michael Wu

NeurIPS 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Based on both simulation studies and real data analysis, we show that, with clustered data, our approach effectively controls type I error and has a higher statistical power than competing methods.
Researcher Affiliation	Academia	Hongjiao Liu, Department of Biostatistics, University of Washington, liuhj@uw.edu; Anna M. Plantinga, Department of Mathematics and Statistics, Williams College, amp9@williams.edu; Yunhua Xiang, Department of Biostatistics, University of Washington, xiangyh@uw.edu; Michael C. Wu, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, mcwu@fredhutch.org
Pseudocode	No	The paper does not include any pseudocode or algorithm blocks.
Open Source Code	Yes	All of our codes are implemented in R, and are available at https://github.com/Liujiao92/HSICcl.
Open Datasets	Yes	Here we apply HSICcl and competing methods to test the dependence between the overall vaginal microbiome composition and different metabolic pathways, using data from the Menopause Strategies: Finding Lasting Answers for Symptoms and Health (MsFLASH) Vaginal Health Trial [27].
Dataset Splits	No	The paper does not specify explicit training, validation, or test dataset splits. It mentions using m clusters and d time points for simulations and real data analysis, but no partitioning for model training/validation/testing.
Hardware Specification	No	The paper does not specify any particular hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies	No	The paper states 'All of our codes are implemented in R' but does not specify the version of R or any specific R libraries with their version numbers.
Experiment Setup	Yes	For both X and Y, we consider two different kernels: the Gaussian kernel $k_X(z_1, z_2) = k_Y(z_1, z_2) = \exp(-\|z_1 - z_2\|_2^2 / \tau)$ and the linear kernel $k_X(z_1, z_2) = k_Y(z_1, z_2) = z_1^T z_2$. For the Gaussian kernel, the shape parameter τ is chosen as the median of the Euclidean distance between each sample pair.
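
The kernel choices quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' code (their implementation is in R, in the repository linked above); it is a minimal pure-Python illustration, and all function names and the toy data are hypothetical. It assumes the median is taken over distinct sample pairs only, which the quoted text does not state explicitly.

```python
import math
from statistics import median


def squared_distance(z1, z2):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(z1, z2))


def median_pairwise_distance(samples):
    """Median Euclidean distance over all distinct sample pairs.

    Assumption: self-pairs (distance 0) are excluded.
    """
    n = len(samples)
    distances = [
        math.sqrt(squared_distance(samples[i], samples[j]))
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return median(distances)


def gaussian_kernel_matrix(samples, tau):
    """Gram matrix with entries exp(-||z_i - z_j||_2^2 / tau)."""
    return [
        [math.exp(-squared_distance(a, b) / tau) for b in samples]
        for a in samples
    ]


def linear_kernel_matrix(samples):
    """Gram matrix with entries z_i^T z_j."""
    return [
        [sum(x * y for x, y in zip(a, b)) for b in samples]
        for a in samples
    ]


# Hypothetical toy data: three 2-dimensional samples.
samples = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
tau = median_pairwise_distance(samples)  # shape parameter for the Gaussian kernel
K_gauss = gaussian_kernel_matrix(samples, tau)
K_lin = linear_kernel_matrix(samples)
```

In this sketch the same kernel function is applied to X and to Y separately, each with its own τ computed from its own samples; whether the authors compute τ per variable in exactly this way would need to be checked against their R code.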