Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Low-Rank Graphon Learning for Networks

Authors: Xinyuan Fan, Feiyan Ma, Chenlei Leng, Weichi Wu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We establish consistency and demonstrate strong empirical performance in terms of computational efficiency and estimation accuracy through simulations and data analysis.
Researcher Affiliation	Academia	Xinyuan Fan, Department of Statistics and Data Science, Tsinghua University, Beijing, China, EMAIL; Feiyan Ma, Weiyang College, Tsinghua University, Beijing, China, EMAIL; Chenlei Leng, Department of Applied Mathematics, Hong Kong Polytechnic University, Hong Kong, China, EMAIL; Weichi Wu, Department of Statistics and Data Science, Tsinghua University, Beijing, China, EMAIL
Pseudocode	Yes	Algorithm 1: Estimation for $\{p_{ij}\}_{i,j=1}^{n}$ in Rank-1 Model. Algorithm 2: Estimation for $\{p_{ij}\}_{i,j=1}^{n}$ in Rank-r Model. Algorithm 3: Fast Estimation Procedure for $\{p_{ij}\}_{i,j=1}^{n}$ in Rank-r Model. Algorithm 4: Procedure for selecting r. Algorithm 5: Iterative Power Iteration.
Open Source Code	Yes	The code is available at https://github.com/Chiyuru/Low-Rank-Graphon-Learning-for-Networks.
Open Datasets	Yes	The datasets used are publicly available and cited appropriately. We consider the U.S. Political Blog Dataset [Adamic and Glance, 2005], which consists of 1490 nodes. Firstly, we applied our method to real contact data from a primary school, collected by the SocioPatterns project using active RFID devices that recorded data every 20 seconds. (http://www.sociopatterns.org)
Dataset Splits	No	The paper generates synthetic networks for simulations and applies the method to real-world datasets. For the real-world datasets, it describes analyzing the networks or subsampling (e.g., removing 10% of nodes for a specific experiment), but does not provide standard training/test/validation splits for model evaluation.
Hardware Specification	Yes	Experiments were conducted on an Apple M1 machine with 16GB RAM, macOS Sonoma, and R 4.2.1.
Software Dependencies	No	The paper mentions "R 4.2.1" as the operating environment. It also lists several comparison methods (USVT, SAS, Nethist, N.S., P.I.) and states they are "implemented using the R functions provided by the respective authors with default parameters." However, specific version numbers for these R functions or packages are not provided, only the academic papers describing the methods.
Experiment Setup	Yes	Networks are generated using seven graphons listed in Table 5 (Appendix) with $n = 2000$. We also consider sparse counterparts, where edge probabilities follow $E_{ij} \sim \mathrm{Bernoulli}(\rho_n f(U_i, U_j))$, with $\rho_n = n^{-1/2}$ controlling sparsity. We conduct 100 independent trials per configuration and report the average metrics. For the experiments, we set the maximum number of iterations to 500 and the convergence threshold to $10^{-6}$.
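
The network generation described in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the authors' R code: the function names, the undirected no-self-loop convention, and the example graphon f(u, v) = uv are all assumptions (the seven graphons of Table 5 are not reproduced here), and the sparsity factor is read as n^(-1/2). A smaller n than the paper's 2000 is used to keep the example light.

```python
import numpy as np


def sample_graphon_network(f, n, rho=1.0, seed=None):
    """Draw one undirected network with edge probabilities rho * f(U_i, U_j).

    Assumptions (not taken from the paper): latent positions U_i are i.i.d.
    Uniform(0, 1), the graph is undirected, and self-loops are excluded.
    """
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)                      # latent positions U_1..U_n
    p = np.clip(rho * f(u[:, None], u[None, :]), 0.0, 1.0)
    # Sample only the strict upper triangle, then mirror it for symmetry.
    upper = np.triu(rng.uniform(size=(n, n)) < p, k=1)
    adj = (upper | upper.T).astype(np.int8)
    return adj, p, u


def rank1_graphon(u, v):
    # Hypothetical rank-1 graphon used purely for illustration.
    return u * v


n = 500
# Dense setting: rho = 1.
adj_dense, p_dense, _ = sample_graphon_network(rank1_graphon, n, rho=1.0, seed=0)
# Sparse counterpart: rho_n = n^(-1/2), same seed so the draws are coupled.
adj_sparse, p_sparse, _ = sample_graphon_network(
    rank1_graphon, n, rho=n ** -0.5, seed=0
)

print("dense mean degree: ", adj_dense.sum() / n)
print("sparse mean degree:", adj_sparse.sum() / n)
```

Averaging a metric over 100 independent trials, as the setup describes, would amount to calling the sampler with 100 different seeds and averaging the resulting estimation errors.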