Optimality, Accuracy, and Efficiency of an Exact Functional Test
Authors: Hien H. Nguyen, Hua Zhong, Mingzhou Song
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Here, we prove the functional optimality of the EFT statistic, demonstrate its advantage in functional inference accuracy over five other methods, and develop a branch-and-bound algorithm with dynamic and quadratic programming to run at orders of magnitude faster than its previous implementation." "We further evaluated the accuracy of EFT in differentiating functional from independent patterns, and it outperformed five other asymmetric methods on simulated patterns with and without noise." "Figure 1 shows the area under the ROC curve (AUROC) and the area under the PR curve (AUPR) as a function of increasing noise levels for each method." "To compare EFT-DQP and the previous EFT quadratic programming (EFT-QP) implementation [Zhong and Song, 2019], we measured their runtime on contingency tables at increasing dimensions and sample sizes (Figure 4)." |
| Researcher Affiliation | Academia | Hien H. Nguyen (1,2), Hua Zhong (1,3), and Mingzhou Song (1). (1) Department of Computer Science, New Mexico State University, Las Cruces, NM, USA; (2) Pennsylvania State University, Harrisburg, PA, USA; (3) Fred Hutchinson Cancer Research Center, Seattle, WA, USA |
| Pseudocode | Yes | Algorithm 1 EFT-DQP(Observed table O) |
| Open Source Code | Yes | Software that implements EFT is freely available in the R package FunChisq (2.5.0) at https://cran.r-project.org/package=FunChisq |
| Open Datasets | No | We generated various contingency tables by a pattern simulator [Sharma et al., 2017]. In a functional pattern, Y functionally depends on X but Y is not a constant function of X; in an independent pattern, X and Y are statistically independent with column (Y) marginal distributions varying from being uniform to non-uniform. We randomly generated 12,000 3x3 functional tables of sample size 100 with uniform row (X) marginal distributions. Then we also generated 12,000 independent tables of sample size 100 at six column marginal extremeness levels (τ=0,1,2,3,4,5). (The paper describes generating its own data using a cited simulator, but does not provide a direct link or public repository for the generated datasets themselves.) |
| Dataset Splits | No | We generated various contingency tables by a pattern simulator [Sharma et al., 2017]. In a functional pattern, Y functionally depends on X but Y is not a constant function of X; in an independent pattern, X and Y are statistically independent with column (Y) marginal distributions varying from being uniform to non-uniform. We randomly generated 12,000 3x3 functional tables of sample size 100 with uniform row (X) marginal distributions. Then we also generated 12,000 independent tables of sample size 100 at six column marginal extremeness levels (τ=0,1,2,3,4,5). (The paper describes data generation and evaluation but does not specify train/validation/test splits, as the methods being evaluated are statistical tests rather than trained models.) |
| Hardware Specification | No | The paper discusses runtime performance but does not specify any hardware details such as CPU, GPU, or memory used for the experiments. |
| Software Dependencies | Yes | Software that implements EFT is freely available in the R package FunChisq (2.5.0) at https://cran.r-project.org/package=FunChisq |
| Experiment Setup | Yes | We randomly generated 12,000 3x3 functional tables of sample size 100 with uniform row (X) marginal distributions. Then we also generated 12,000 independent tables of sample size 100 at six column marginal extremeness levels (τ=0,1,2,3,4,5). The extremeness is controlled by the column sum ratio, set as 1^τ : 2^τ : 3^τ for 3x3 tables. Column marginals are uniform at τ=0, and become most non-uniform when τ is 5. We evaluate the accuracy of EFT and the five other methods on distinguishing the two pattern types at four noise levels 0, 0.3, 0.6 and 1 using the house noise model [Zhang et al., 2015]. |
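
The column sum ratio in the Experiment Setup row can be made concrete with a short sketch. It assumes the ratio is read as powers, that is, column k of a 3x3 table gets weight k raised to τ, which is consistent with the statement that the marginals are uniform at τ=0. The function name `column_marginals` is hypothetical and is not part of FunChisq or the pattern simulator; the sketch only normalizes the weights and does not reproduce the authors' table generation.

```python
from fractions import Fraction


def column_marginals(tau, n_cols=3):
    """Column probabilities proportional to 1^tau : 2^tau : ... : n_cols^tau.

    Hypothetical helper for illustration; assumes the ratio in the setup
    description is a power of the column index.
    """
    weights = [k ** tau for k in range(1, n_cols + 1)]
    total = sum(weights)
    # Exact fractions avoid floating-point rounding in the normalization.
    return [Fraction(w, total) for w in weights]


if __name__ == "__main__":
    # The six extremeness levels listed in the setup description.
    for tau in range(6):
        probs = column_marginals(tau)
        print(tau, [str(p) for p in probs])
```

Under this reading, τ=0 gives each column probability 1/3, and τ=5 gives weights 1 : 32 : 243, so nearly all of the 100 samples in an independent table would be expected to fall in the third column.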