Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

On clustering network-valued data

Authors: Soumendu Sundar Mukherjee, Purnamrita Sarkar, Lizhen Lin

NeurIPS 2017 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We illustrate our methods using both simulated and real data sets, and theoretical justifications are provided in terms of consistency.
Researcher Affiliation	Academia	Soumendu Sundar Mukherjee Department of Statistics University of California, Berkeley Berkeley, California 94720, USA EMAIL Purnamrita Sarkar Department of Statistics and Data Sciences University of Texas, Austin Austin, Texas 78712, USA EMAIL Lizhen Lin Department of Applied and Computational Mathematics and Statistics University of Notre Dame Notre Dame, Indiana 46556, USA EMAIL
Pseudocode	Yes	Algorithm 1 Network Clustering based on Graphon Estimates (NCGE) Algorithm 2 Network Clustering based on Log Moments (NCLM)
Open Source Code	Yes	Code used in this paper is publicly available at https://github.com/soumendu041/clustering-network-valued-data.
Open Datasets	Yes	We cluster about fifty real world networks. We use 11 co-authorship networks between 15,000 researchers from the High Energy Physics corpus of the arXiv, 11 co-authorship networks with 21,000 nodes from Citeseer (which had Machine Learning in their abstracts), 17 co-authorship networks (each with about 3000 nodes) from the NIPS conference and finally 10 Facebook ego networks [2]. ... [2] https://snap.stanford.edu/data/egonets-Facebook.html
Dataset Splits	No	The paper mentions using simulated and real data for experiments but does not provide specific train/validation/test splits or cross-validation details for reproduction.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies	No	The paper does not list specific software dependencies with version numbers required to reproduce the experiments.
Experiment Setup	No	The paper describes the algorithms and their theoretical properties but does not provide concrete hyperparameter values or system-level training settings for the experiments.