Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

Network two-sample test for block models

Authors: Chung Kyong Nguen, Arash Amini, Oscar Hernan Madrid Padilla

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through a mixture of theoretical insights and empirical validations, including experiments with both synthetic and real-world data, this study advances robust statistical inference for complex network data. ... In this section, we provide experimental results on real and simulated networks and compare our proposed test to two existing approaches.
Researcher Affiliation	Academia	Department of Statistics University of California, Los Angeles California, CA 90024 EMAIL
Pseudocode	Yes	Algorithm 1 Spectral Matching: $M(\hat{B}_1 \; \hat{B}_2)$ ... Algorithm 2 SBM Two-Sample Test (SBM-TS)
Open Source Code	Yes	The code for reproducing all experiments in this section is publicly available at https://github.com/aaamini/sbm-ts.
Open Datasets	Yes	The COLLAB dataset is a scientific collaboration dataset first introduced in [40]. ... The Star Wars (SW) Game of Thrones (GOT) dataset [34] is derived from popular films and television series. ... Table 2: Average degree vs. size for subsets of the Amazon Computers dataset.
Dataset Splits	No	The paper describes how synthetic data is generated for experiments, setting parameters such as "sample sizes to $N_r = 100$, the number of vertices to $n_{rt} = 10000$, the noise level to $\varepsilon = 0.05$, and sparsity factor to $\rho = 0.1$". For real-world data, it mentions "$N_1 = N_2 = m$; the two samples under null will be drawn at random (without replacement) both from class $C_i$, and under the alternative from classes $C_i$ and $C_j$ for $i \neq j$" and "randomly subsamples nodes" for analyzing subgraphs. However, it does not specify explicit train/test/validation splits for machine learning tasks.
Hardware Specification	Yes	All experiments were conducted on an internal computing cluster with Intel(R) Xeon(R) Platinum 8160 CPUs (48 cores, 2.10GHz).
Software Dependencies	No	The paper mentions the "PyTorch Geometric library" but does not provide specific version numbers for it or any other key software components used in the experiments.
Experiment Setup	Yes	We generate 50 instances of the testing problem $B_1 = B_2 = \rho B^{(i)}$ versus $(B_1, B_2) = (\rho B^{(i)}, \rho B^{(i)}_{\varepsilon})$, where $B^{(i)}$ is a symmetric matrix whose entries are drawn from $U(0.2, 0.7)$ and $B^{(i)}_{\varepsilon} = B^{(i)} + M^{(i)}_{\varepsilon}$ such that $M^{(i)}_{\varepsilon}$ has entries $N(0, \varepsilon^2)$ subject to symmetry. We set the sample sizes to $N_r = 100$, the number of vertices to $n_{rt} = 10000$, the noise level to $\varepsilon = 0.05$, and sparsity factor to $\rho = 0.1$.
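
The synthetic setup quoted in the Experiment Setup row can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' code (which is in the linked repository). The number of blocks `k`, the helper names `symmetric_from_upper` and `make_instance`, and the seeding scheme are assumptions introduced here; the quoted text does not give the matrix dimension or say whether perturbed entries are clipped, so no clipping is applied.

```python
import numpy as np

def symmetric_from_upper(draw, k, rng):
    """Build a k x k symmetric matrix from draws on the upper triangle (diagonal included)."""
    m = np.zeros((k, k))
    iu = np.triu_indices(k)
    m[iu] = draw(rng, len(iu[0]))
    # Mirror the strict upper triangle onto the lower triangle.
    return m + np.triu(m, 1).T

def make_instance(k, eps=0.05, rho=0.1, seed=0):
    """Return (rho * B, rho * B_eps) for one instance; k is a hypothetical block count."""
    rng = np.random.default_rng(seed)
    # B has entries drawn from U(0.2, 0.7), subject to symmetry.
    b = symmetric_from_upper(lambda r, n: r.uniform(0.2, 0.7, n), k, rng)
    # M_eps has entries drawn from N(0, eps^2), subject to symmetry.
    noise = symmetric_from_upper(lambda r, n: r.normal(0.0, eps, n), k, rng)
    b_eps = b + noise
    return rho * b, rho * b_eps

# Null hypothesis: both samples use rho * B.
# Alternative: the second sample uses rho * B_eps.
b_null, b_alt = make_instance(k=4, seed=1)
```

Calling `make_instance` with 50 different seeds would give 50 instances in the spirit of the quoted setup; sampling the networks themselves from these connectivity matrices is left out of the sketch.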