reproducibilityindex.ai

Optimality, Accuracy, and Efficiency of an Exact Functional Test

Authors: Hien H. Nguyen, Hua Zhong, Mingzhou Song

IJCAI 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Here, we prove the functional optimality of the EFT statistic, demonstrate its advantage in functional inference accuracy over five other methods, and develop a branch-and-bound algorithm with dynamic and quadratic programming to run at orders of magnitude faster than its previous implementation. We further evaluated the accuracy of EFT in differentiating functional from independent patterns, and it outperformed five other asymmetric methods on simulated patterns with and without noise. Figure 1 shows the area under the ROC curve (AUROC) and the area under the PR curve (AUPR) as a function of increasing noise levels for each method. To compare EFT-DQP and the previous EFT quadratic programming (EFT-QP) implementation [Zhong and Song, 2019], we measured their runtime on contingency tables at increasing dimensions and sample sizes (Figure 4).
Researcher Affiliation	Academia	Hien H. Nguyen (1,2), Hua Zhong (1,3), and Mingzhou Song (1). (1) Department of Computer Science, New Mexico State University, Las Cruces, NM, USA; (2) Pennsylvania State University, Harrisburg, PA, USA; (3) Fred Hutchinson Cancer Research Center, Seattle, WA, USA
Pseudocode	Yes	Algorithm 1 EFT-DQP(Observed table O)
Open Source Code	Yes	Software that implements EFT is freely available in the R package FunChisq (2.5.0) at https://cran.r-project.org/package=FunChisq
Open Datasets	No	We generated various contingency tables by a pattern simulator [Sharma et al., 2017]. In a functional pattern, Y functionally depends on X but Y is not a constant function of X; in an independent pattern, X and Y are statistically independent with column (Y) marginal distributions varying from being uniform to non-uniform. We randomly generated 12,000 3x3 functional tables of sample size 100 with uniform row (X) marginal distributions. Then we also generated 12,000 independent tables of sample size 100 at six column marginal extremeness levels (τ=0,1,2,3,4,5). (The paper describes generating its own data using a cited simulator, but does not provide a direct link or public repository for the generated datasets themselves.)
Dataset Splits	No	We generated various contingency tables by a pattern simulator [Sharma et al., 2017]. In a functional pattern, Y functionally depends on X but Y is not a constant function of X; in an independent pattern, X and Y are statistically independent with column (Y) marginal distributions varying from being uniform to non-uniform. We randomly generated 12,000 3x3 functional tables of sample size 100 with uniform row (X) marginal distributions. Then we also generated 12,000 independent tables of sample size 100 at six column marginal extremeness levels (τ=0,1,2,3,4,5). (The paper describes data generation and evaluation but does not specify train/validation/test splits, as the methods being evaluated are statistical tests rather than trained models.)
Hardware Specification	No	The paper discusses runtime performance but does not specify any hardware details such as CPU, GPU, or memory used for the experiments.
Software Dependencies	Yes	Software that implements EFT is freely available in the R package FunChisq (2.5.0) at https://cran.r-project.org/package=FunChisq
Experiment Setup	Yes	We randomly generated 12,000 3x3 functional tables of sample size 100 with uniform row (X) marginal distributions. Then we also generated 12,000 independent tables of sample size 100 at six column marginal extremeness levels (τ=0,1,2,3,4,5). The extremeness is controlled by the column sum ratio, set as 1^τ:2^τ:3^τ for 3x3 tables. Column marginals are uniform at τ=0, and become most non-uniform when τ is 5. We evaluate the accuracy of EFT and the five other methods on distinguishing the two pattern types at four noise levels 0, 0.3, 0.6 and 1 using the house noise model [Zhang et al., 2015].
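
The column sum ratio in the experiment setup can be made concrete with a small sketch. The Python below is an illustration only and is not the pattern simulator of [Sharma et al., 2017]. The function names `column_marginals` and `sample_independent_table` are hypothetical, and the uniform row marginal used for independent tables is an assumption (the source states uniform row marginals only for the functional tables). It turns the ratio 1^τ:2^τ:3^τ into column probabilities and draws one 3x3 independent table of sample size 100.

```python
import random
from fractions import Fraction


def column_marginals(tau, n_cols=3):
    """Column probabilities proportional to 1^tau : 2^tau : ... : n_cols^tau."""
    weights = [Fraction(k) ** tau for k in range(1, n_cols + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_independent_table(tau, n=100, n_rows=3, n_cols=3, seed=0):
    """Draw one contingency table in which X (rows) and Y (columns) are independent.

    Assumption: rows are drawn uniformly; columns follow column_marginals(tau).
    """
    rng = random.Random(seed)
    col_p = [float(p) for p in column_marginals(tau, n_cols)]
    table = [[0] * n_cols for _ in range(n_rows)]
    for _ in range(n):
        r = rng.randrange(n_rows)                          # row index, uniform
        c = rng.choices(range(n_cols), weights=col_p)[0]   # column index, skewed by tau
        table[r][c] += 1
    return table


if __name__ == "__main__":
    for tau in range(6):
        print(tau, [str(p) for p in column_marginals(tau)])
    print(sample_independent_table(tau=3))
```

At τ=0 the three column probabilities are each 1/3, matching the statement that column marginals are uniform at τ=0. At τ=5 the ratio is 1:32:243, so almost all of the mass falls in the third column.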