Practical Methods for Graph Two-Sample Testing
Authors: Debarghya Ghoshdastidar, Ulrike von Luxburg
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we consider the problem of two-sample testing of large graphs. We demonstrate the practical merits and limitations of existing theoretical tests and their bootstrapped variants. We also propose two new tests based on asymptotic distributions. We show that these tests are computationally less expensive and, in some cases, more reliable than the existing methods. ... We present our numerical results in three groups: (i) results for random graphs for m > 1, (ii) results for random graphs for m = 1, and (iii) results for testing real networks. |
| Researcher Affiliation | Academia | Debarghya Ghoshdastidar Department of Computer Science University of Tübingen ghoshdas@informatik.uni-tuebingen.de Ulrike von Luxburg Department of Computer Science University of Tübingen Max Planck Institute for Intelligent Systems luxburg@informatik.uni-tuebingen.de |
| Pseudocode | No | The paper refers to Algorithm 1 in Tang et al. (2016) but does not provide its own pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our aim is also to make the existing and proposed tests more accessible for applied research. We provide Matlab implementations of the tests in the supplementary material. ... Matlab codes are provided in the supplementary.1 Also available at: https://github.com/gdebarghya/Network-Two Sample Testing. |
| Open Datasets | Yes | In the first setup, we consider moderate sized graphs (n = 178) constructed by thresholding autocorrelation matrices of EEG recordings (Andrzejak et al., 2001, Dua and Taniskidou, 2017). ... We also study networks corresponding to peering information of autonomous systems, that is, graphs defined on the routers comprising the Internet with the edges representing who-talks-to-whom (Leskovec et al., 2005, Leskovec and Krevl, 2014). |
| Dataset Splits | No | The paper does not provide specific details on how datasets are split into training, validation, or test sets for reproducibility. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., CPU, GPU models, memory). |
| Software Dependencies | No | The paper mentions 'Matlab implementations' but does not specify version numbers for Matlab or any other software libraries or dependencies. |
| Experiment Setup | Yes | We let n grow from 100 to 1000 in steps of 100, and set p = 0.1 and q = 0.05. We set ϵ = 0 and 0.04 for null and alternative hypotheses, respectively. We use two values of population size, m {2, 4}, and fix the significance level at α = 5%. ... We use r {2, 4} to specify the rank parameter for bootstrap tests, and also as the number of blocks used for community detection step of Asymp-TW. |