Kernelized Stein Discrepancy Tests of Goodness-of-fit for Time-to-Event Data
Authors: Tamara Fernandez, Nicolas Rivera, Wenkai Xu, Arthur Gretton
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies and results are presented in Section 6, where we compare with a recent state-of-the-art non-parametric test for censored data by Fernandez & Gretton (2019) based on the MMD, which has been shown to outperform classical tests. Our experimental results show that our proposed methods perform better than existing tests, including previous tests based on a kernelized maximum mean discrepancy. |
| Researcher Affiliation | Academia | (1) Gatsby Computational Neuroscience Unit, University College London, United Kingdom; (2) Department of Computer Science and Technology, University of Cambridge, United Kingdom. |
| Pseudocode | No | No pseudocode or algorithm blocks are present in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code related to the described methodology. |
| Open Datasets | Yes | aml: Acute Myelogenous Leukemia survival dataset (Miller Jr, 2011); cgd: Chronic Granulomatous Disease dataset (Fleming & Harrington, 2011); ovarian: Ovarian Cancer Survival dataset (Edmonson et al., 1979); lung: North Central Cancer Treatment Group (NCCTG) Lung Cancer dataset (Loprinzi et al., 1994); stanford: Stanford Heart Transplant Data (Crowley & Hu, 1977); nafld: Non-alcohol fatty liver disease (NAFLD) (Allen et al., 2018). Reported p-values (Tables 1 and 2 of the paper, captioned 'Real data applications on testing hazard proportionality' and 'Real data applications on testing goodness of fit'): Exponential: aml 0.585, cgd 0.460, ovarian 0.681; Weibull (shape=2): aml 0.001, cgd 0.002, ovarian 0.063; lung (covariate: Age) 0.167; stanford (covariate: T5 mismatch score) 0.594; nafld (covariates: Weight and Gender) 0.108. |
| Dataset Splits | No | The paper mentions 'splitting the data into training set and test sets' but does not specify a validation set or detailed split percentages for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers needed to replicate the experiment. |
| Experiment Setup | Yes | In all our experiments we choose the null as an exponential distribution of rate 1, and in this case we can check that sKSD and mKSD coincide. Additionally, we implement mKSDu, which is given by the test mKSD applied to the transformed data $((F_0(T_i), \Delta_i))_{i=1}^n$ to test $H_0 : F_0(X) \sim U(0, 1)$. Finally, we use a Gaussian kernel with length-scale chosen by using the median-heuristic, which is the median of all the absolute differences between two different data points. |
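
The Experiment Setup row describes two small preprocessing steps that can be sketched in code: choosing the Gaussian kernel length-scale by the median heuristic, and mapping the observed times through the null CDF of an exponential distribution with rate 1 (the transformation used for mKSDu). The following is a minimal Python sketch, not the authors' implementation. The function names, the use of NumPy, the reading of "two different data points" as all index pairs i < j, and the kernel parametrization exp(-(x - y)^2 / (2 l^2)) are all assumptions made for illustration; the paper does not release code or state its parametrization.

```python
import numpy as np


def median_heuristic(x):
    """Median of |x_i - x_j| over all pairs of distinct indices i < j (assumed reading)."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x), k=1)  # strictly upper-triangular index pairs
    return float(np.median(np.abs(x[i] - x[j])))


def gaussian_kernel(x, y, length_scale):
    """Kernel matrix with entries exp(-(x_i - y_j)^2 / (2 * length_scale^2)) (assumed form)."""
    x = np.asarray(x, dtype=float)[:, None]
    y = np.asarray(y, dtype=float)[None, :]
    return np.exp(-((x - y) ** 2) / (2.0 * length_scale ** 2))


def exponential_cdf(t, rate=1.0):
    """CDF F_0 of the null: exponential distribution with the given rate."""
    return 1.0 - np.exp(-rate * np.asarray(t, dtype=float))


def transform_to_uniform(times, deltas, rate=1.0):
    """Map (T_i, Delta_i) to (F_0(T_i), Delta_i); censoring indicators are left unchanged."""
    return exponential_cdf(times, rate), np.asarray(deltas)


if __name__ == "__main__":
    # Hypothetical toy data: observed times and censoring indicators (1 = event observed).
    times = np.array([0.2, 0.9, 1.4, 2.3])
    deltas = np.array([1, 0, 1, 1])

    length_scale = median_heuristic(times)
    K = gaussian_kernel(times, times, length_scale)
    u, d = transform_to_uniform(times, deltas)
    print(length_scale, K.shape, u, d)
```

The length-scale here is computed from the observed times only. Whether the heuristic should be recomputed on the transformed values F_0(T_i) when running mKSDu is not stated in the summary above, so that choice is left to the reader.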