A Kernel-based Test of Independence for Cluster-correlated Data
Authors: Hongjiao Liu, Anna Plantinga, Yunhua Xiang, Michael Wu
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Based on both simulation studies and real data analysis, we show that, with clustered data, our approach effectively controls type I error and has a higher statistical power than competing methods. |
| Researcher Affiliation | Academia | Hongjiao Liu, Department of Biostatistics, University of Washington (liuhj@uw.edu); Anna M. Plantinga, Department of Mathematics and Statistics, Williams College (amp9@williams.edu); Yunhua Xiang, Department of Biostatistics, University of Washington (xiangyh@uw.edu); Michael C. Wu, Public Health Sciences Division, Fred Hutchinson Cancer Research Center (mcwu@fredhutch.org) |
| Pseudocode | No | The paper does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | All of our codes are implemented in R, and are available at https://github.com/Liujiao92/HSICcl. |
| Open Datasets | Yes | Here we apply HSICcl and competing methods to test the dependence between the overall vaginal microbiome composition and different metabolic pathways, using data from the Menopause Strategies: Finding Lasting Answers for Symptoms and Health (MsFLASH) Vaginal Health Trial [27]. |
| Dataset Splits | No | The paper does not specify explicit training, validation, or test dataset splits. It mentions using m clusters and d time points for simulations and real data analysis, but no partitioning for model training/validation/testing. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states 'All of our codes are implemented in R' but does not specify the version of R or any specific R libraries with their version numbers. |
| Experiment Setup | Yes | For both X and Y, we consider two different kernels: the Gaussian kernel $k_X(z_1, z_2) = k_Y(z_1, z_2) = \exp(-\lVert z_1 - z_2 \rVert_2^2 / \tau)$ and the linear kernel $k_X(z_1, z_2) = k_Y(z_1, z_2) = z_1^T z_2$. For the Gaussian kernel, the shape parameter τ is chosen as the median of the Euclidean distance between each sample pair. |
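
The kernel choices quoted in the Experiment Setup row can be illustrated with a minimal sketch. The authors' implementation is in R and is not reproduced here. This is a plain-Python illustration, and the function names (`median_tau`, `gaussian_kernel_matrix`, `linear_kernel_matrix`) and the toy data are hypothetical. It assumes the median is taken over distinct sample pairs (self-pairs excluded) and that τ is the median of the unsquared Euclidean distances, as the quoted text reads; the released code may differ on either point.

```python
import math
from itertools import combinations
from statistics import median


def euclidean(z1, z2):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z1, z2)))


def median_tau(samples):
    """Median of the Euclidean distances over all distinct sample pairs.

    Assumption: self-pairs (distance 0) are excluded from the median.
    """
    return median(euclidean(a, b) for a, b in combinations(samples, 2))


def gaussian_kernel_matrix(samples, tau=None):
    """Gram matrix with entries exp(-||z_i - z_j||_2^2 / tau)."""
    if tau is None:
        tau = median_tau(samples)
    return [[math.exp(-euclidean(a, b) ** 2 / tau) for b in samples]
            for a in samples]


def linear_kernel_matrix(samples):
    """Gram matrix with entries z_i^T z_j."""
    return [[sum(x * y for x, y in zip(a, b)) for b in samples]
            for a in samples]


# Toy data (illustrative only): three points in the plane.
Z = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
tau = median_tau(Z)                 # pairwise distances are 5, 10, 5
K_gauss = gaussian_kernel_matrix(Z)
K_lin = linear_kernel_matrix(Z)
```

In this sketch the same kernel function would be applied separately to the X samples and the Y samples, each with its own τ, to obtain the two Gram matrices that a kernel independence test operates on.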