Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On clustering network-valued data
Authors: Soumendu Sundar Mukherjee, Purnamrita Sarkar, Lizhen Lin
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate our methods using both simulated and real data sets, and theoretical justifications are provided in terms of consistency. |
| Researcher Affiliation | Academia | Soumendu Sundar Mukherjee, Department of Statistics, University of California, Berkeley, Berkeley, California 94720, USA; Purnamrita Sarkar, Department of Statistics and Data Sciences, University of Texas, Austin, Austin, Texas 78712, USA; Lizhen Lin, Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, Indiana 46556, USA |
| Pseudocode | Yes | Algorithm 1 Network Clustering based on Graphon Estimates (NCGE) Algorithm 2 Network Clustering based on Log Moments (NCLM) |
| Open Source Code | Yes | Code used in this paper is publicly available at https://github.com/soumendu041/clustering-network-valued-data. |
| Open Datasets | Yes | We cluster about fifty real world networks. We use 11 co-authorship networks between 15,000 researchers from the High Energy Physics corpus of the arXiv, 11 co-authorship networks with 21,000 nodes from Citeseer (which had Machine Learning in their abstracts), 17 co-authorship networks (each with about 3000 nodes) from the NIPS conference and finally 10 Facebook ego networks (footnote 2). ... Footnote 2: https://snap.stanford.edu/data/egonets-Facebook.html |
| Dataset Splits | No | The paper mentions using simulated and real data for experiments but does not provide specific train/validation/test splits or cross-validation details for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers required to reproduce the experiments. |
| Experiment Setup | No | The paper describes the algorithms and their theoretical properties but does not provide concrete hyperparameter values or system-level training settings for the experiments. |