reproducibilityindex.ai

A Powerful Global Test Statistic for Functional Statistical Inference

Authors: Jingwen Zhang, Joseph Ibrahim, Tengfei Li, Hongtu Zhu
Pages: 5765-5772

AAAI 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We use simulations to show that the proposed test outperforms all existing state-of-the-art methods in functional statistical inference. Finally, we apply the proposed testing method to the genome-wide association analysis of imaging genetic data in UK Biobank dataset.
Researcher Affiliation	Academia	(1) Department of Biostatistics, UNC Gillings School of Global Public Health; (2) Biomedical Research Imaging Center, UNC School of Medicine; University of North Carolina at Chapel Hill
Pseudocode	Yes	Algorithm 1. (a) Fit the varying coefficient model under the null hypothesis and get the estimates $\hat{\beta}_0(s)$ and $\{\hat{\eta}_{i,0}(s), \hat{e}_{i,0}(s)\}_{i=1}^{n}$. (b) For $g = 1, \ldots, G$, generate independent random numbers $\nu_i^{(g)}$ and $\nu_i^{(g)}(s_m)$ from $N(0, 1)$, and the wild bootstrap sample on each grid point can be calculated as $\hat{y}_i^{(g)}(s_m) = \hat{\beta}_0(s_m)^T x_i + \nu_i^{(g)} \hat{\eta}_{i,0}(s_m) + \nu_i^{(g)}(s_m) \hat{e}_{i,0}(s_m)$. (c) Repeat the testing procedure and obtain $G$ samples of $T_n(\hat{\omega}_\lambda^{(g)})$ under the null hypothesis. (d) The p-value is approximated by $p = G^{-1} \sum_{g=1}^{G} I\{T_n(\hat{\omega}_\lambda) \le T_n(\hat{\omega}_\lambda^{(g)})\}$.
Open Source Code	No	The paper does not provide an explicit statement or link for open-source code of the described methodology. A link is provided for the proof of a theorem, not the code.
Open Datasets	Yes	Finally, we apply the proposed testing method to the genome-wide association analysis of imaging genetic data in UK Biobank dataset.
Dataset Splits	No	The paper mentions total sample sizes and the number of simulation replicates, but it does not explicitly specify training, validation, or test dataset splits using percentages, sample counts, or references to predefined splits.
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies	No	The paper mentions the "FSL tool set (Jenkinson et al. 2012)" but does not provide specific version numbers for this or any other software dependencies, which are necessary for reproducibility.
Experiment Setup	Yes	We set n = 200 and S = 1 and put the number of grid points M = 100 evenly in [0, 1]. For the choice of the tuning parameter, we considered both fixed quantities where $\log \lambda_n$ takes values from [-2, 0] with an equal increment of 0.1 (PFGT-$\lambda_n$) and an optimal $\hat{\lambda}_n$ selected by (19) in each run (PFGT-optimal). In each scenario, 1,000 simulation replicates were generated to evaluate type I and type II error rates respectively. To calculate p-values, G = 1,000 wild-bootstrap samples were generated in each run. We fitted model (1) with covariates including an intercept term, a specific SNP, age, gender, and the top 5 genetic principal components. For each MAF category, we generated 10,000 bootstrap samples and adopted a mixed chi-square approximation (Zhang 2005) to approximate the null distribution of the test statistic.
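
The wild-bootstrap steps quoted in the Pseudocode row could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation (the Open Source Code row notes that none is provided). It assumes the null-model fits from step (a) are already available as arrays. The names `wild_bootstrap_pvalue`, `toy_statistic`, `beta0_hat`, `eta0_hat`, and `e0_hat` are hypothetical. The paper's statistic T_n is not defined in this summary, so `toy_statistic` is only a stand-in. The comparison in step (d) is assumed to count bootstrap statistics at least as large as the observed one, which is the usual convention for a p-value.

```python
import numpy as np


def wild_bootstrap_pvalue(t_obs, X, beta0_hat, eta0_hat, e0_hat, statistic,
                          G=1000, seed=0):
    """Approximate a p-value from G wild-bootstrap samples (steps b-d).

    X         : (n, p) covariates
    beta0_hat : (M, p) null-model coefficient estimates on M grid points
    eta0_hat  : (n, M) estimated subject-level functions under the null
    e0_hat    : (n, M) estimated measurement errors under the null
    statistic : callable (Y, X) -> float; hypothetical stand-in for T_n
    """
    rng = np.random.default_rng(seed)
    n, M = eta0_hat.shape
    fitted = X @ beta0_hat.T                      # (n, M) null fitted values
    t_boot = np.empty(G)
    for g in range(G):
        nu_subject = rng.standard_normal((n, 1))  # one N(0,1) draw per subject
        nu_point = rng.standard_normal((n, M))    # one draw per subject and grid point
        y_g = fitted + nu_subject * eta0_hat + nu_point * e0_hat
        t_boot[g] = statistic(y_g, X)             # step (c): rerun the test
    # Step (d), assumed direction: share of bootstrap statistics >= observed.
    return float(np.mean(t_obs <= t_boot))


def toy_statistic(Y, X):
    """Stand-in statistic (NOT the paper's T_n): squared least-squares
    coefficient of the last covariate, summed over grid points."""
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)  # coef has shape (p, M)
    return float(np.sum(coef[-1] ** 2))


# Synthetic data using n = 200 and M = 100 from the Experiment Setup row.
rng = np.random.default_rng(1)
n, M, p = 200, 100, 2
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta0_hat = np.zeros((M, p))
beta0_hat[:, 0] = 1.0                             # intercept only under the null
eta0_hat = rng.standard_normal((n, 1)) * np.ones((1, M))
e0_hat = 0.5 * rng.standard_normal((n, M))
Y = X @ beta0_hat.T + eta0_hat + e0_hat

p_value = wild_bootstrap_pvalue(toy_statistic(Y, X), X, beta0_hat,
                                eta0_hat, e0_hat, toy_statistic, G=200)
```

The example uses G = 200 only to keep the run short; the setup quoted above uses G = 1,000 bootstrap samples per run.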