reproducibilityindex.ai

Practical Methods for Graph Two-Sample Testing

Authors: Debarghya Ghoshdastidar, Ulrike von Luxburg

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this paper, we consider the problem of two-sample testing of large graphs. We demonstrate the practical merits and limitations of existing theoretical tests and their bootstrapped variants. We also propose two new tests based on asymptotic distributions. We show that these tests are computationally less expensive and, in some cases, more reliable than the existing methods. ... We present our numerical results in three groups: (i) results for random graphs for m > 1, (ii) results for random graphs for m = 1, and (iii) results for testing real networks.
Researcher Affiliation	Academia	Debarghya Ghoshdastidar Department of Computer Science University of Tübingen ghoshdas@informatik.uni-tuebingen.de Ulrike von Luxburg Department of Computer Science University of Tübingen Max Planck Institute for Intelligent Systems luxburg@informatik.uni-tuebingen.de
Pseudocode	No	The paper refers to Algorithm 1 in Tang et al. (2016) but does not provide its own pseudocode or algorithm blocks.
Open Source Code	Yes	Our aim is also to make the existing and proposed tests more accessible for applied research. We provide Matlab implementations of the tests in the supplementary material. ... Matlab codes are provided in the supplementary.1 Also available at: https://github.com/gdebarghya/Network-Two Sample Testing.
Open Datasets	Yes	In the first setup, we consider moderate sized graphs (n = 178) constructed by thresholding autocorrelation matrices of EEG recordings (Andrzejak et al., 2001, Dua and Taniskidou, 2017). ... We also study networks corresponding to peering information of autonomous systems, that is, graphs deﬁned on the routers comprising the Internet with the edges representing who-talks-to-whom (Leskovec et al., 2005, Leskovec and Krevl, 2014).
Dataset Splits	No	The paper does not provide specific details on how datasets are split into training, validation, or test sets for reproducibility.
Hardware Specification	No	The paper does not provide any specific details about the hardware used for running the experiments (e.g., CPU, GPU models, memory).
Software Dependencies	No	The paper mentions 'Matlab implementations' but does not specify version numbers for Matlab or any other software libraries or dependencies.
Experiment Setup	Yes	We let n grow from 100 to 1000 in steps of 100, and set p = 0.1 and q = 0.05. We set ϵ = 0 and 0.04 for null and alternative hypotheses, respectively. We use two values of population size, m {2, 4}, and ﬁx the signiﬁcance level at α = 5%. ... We use r {2, 4} to specify the rank parameter for bootstrap tests, and also as the number of blocks used for community detection step of Asymp-TW.