Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Structure-Aware Spectral Sparsification via Uniform Edge Sampling

Authors: Kaiwen He, Petros Drineas, Rajiv Khanna

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	5 Experiments We empirically validate our theoretical results by comparing uniform edge sampling against effective resistance sampling on synthetic graphs generated by a Stochastic Block Model [1]. We focus on graphs with k = 4 clusters with 200 nodes per cluster. To measure the error, we compute the bottom k = 4 eigenvectors of the sparsified graph, and we measure the largest principal angle between the bottom 4 eigenvectors with the true cluster indicator vectors (i.e., sin Θ(V_k, C)). Smaller angles indicate better preservation of the cluster structure in the spectral embedding.
Researcher Affiliation	Academia	Kaiwen He Department of Computer Science Purdue University EMAIL Petros Drineas Department of Computer Science Purdue University EMAIL Rajiv Khanna Department of Computer Science Purdue University EMAIL
Pseudocode	No	The paper describes theoretical proofs, theorems, and lemmas, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code	Yes	Question: Does the paper fully disclose all the information needed to reproduce the main experimental results of the paper to the extent that it affects the main claims and/or conclusions of the paper (regardless of whether the code and data are provided or not)? Answer: [Yes] Justification: We will release the experimental code and datasets. In Section 5 we describe the setup for experiments. (...) Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We will release all the code for generating the experimental results. We describe the setup in Section 5.
Open Datasets	Yes	We empirically validate our theoretical results by comparing uniform edge sampling against effective resistance sampling on synthetic graphs generated by a Stochastic Block Model [1]. (...) Experiments are done similarly for a hierarchical stochastic block model. (...) We perform experiments based on the network benchmark graphs by [13].
Dataset Splits	No	The paper describes how synthetic graphs are generated for experiments (e.g., 'We focus on graphs with k = 4 clusters with 200 nodes per cluster', 'p_intra-sub = 0.5, p_inter-sub = 0.10, p_inter-top = 0.005'), but it does not specify explicit training/test/validation dataset splits typically used for reproducing machine learning experiments.
Hardware Specification	Yes	All experiments were run on a Macbook Pro M1 with 16GB of RAM.
Software Dependencies	No	The paper mentions 'computing the pseudoinverse of the unnormalized Laplacian' but does not specify any particular software, libraries, or their version numbers used in the experiments (e.g., Python 3.x, PyTorch 1.x, NumPy 1.x).
Experiment Setup	Yes	We focus on graphs with k = 4 clusters with 200 nodes per cluster. To measure the error, we compute the bottom k = 4 eigenvectors of the sparsified graph, and we measure the largest principal angle between the bottom 4 eigenvectors with the true cluster indicator vectors (i.e., sin Θ(V_k, C)). Smaller angles indicate better preservation of the cluster structure in the spectral embedding. We evaluate both sampling strategies in two settings. (...) Strong Hierarchical Structure: p_intra-sub = 0.5, p_inter-sub = 0.10, p_inter-top = 0.005 (...) We perform experiments based on the network benchmark graphs by [13]. Experiments are performed for a network of 800 nodes. The mixing parameter µ determines the fraction of edges connecting to other communities, which we vary to generate strong versus weak community structure.
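
The experiment setup quoted in the table can be illustrated with a minimal sketch. This is not the authors' code, which the record does not link. It generates a flat stochastic block model graph with k = 4 clusters of 200 nodes, sparsifies it by uniform edge sampling, and reports the sine of the largest principal angle between the bottom k Laplacian eigenvectors and the cluster indicator vectors. Several choices are assumptions made only for illustration: the edge probabilities p_in and p_out, the sampling probability q, the 1/q reweighting of kept edges, the use of the unnormalized Laplacian for the eigenvectors, and all function names. The effective resistance baseline is omitted.

```python
import numpy as np


def sbm_adjacency(k, n_per, p_in, p_out, rng):
    """Symmetric 0/1 adjacency matrix of a flat stochastic block model."""
    n = k * n_per
    labels = np.repeat(np.arange(k), n_per)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    # Sample the strict upper triangle once, then mirror it.
    upper = np.triu(rng.random((n, n)) < probs, 1)
    return (upper | upper.T).astype(float), labels


def uniform_sparsify(A, q, rng):
    """Keep each edge independently with probability q (assumed scheme),
    reweighting kept edges by 1/q so the Laplacian is unbiased."""
    iu, ju = np.nonzero(np.triu(A, 1))
    keep = rng.random(iu.size) < q
    S = np.zeros_like(A)
    S[iu[keep], ju[keep]] = A[iu[keep], ju[keep]] / q
    return S + S.T


def bottom_eigvecs(A, k):
    """Bottom-k eigenvectors of the unnormalized Laplacian L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    return vecs[:, :k]


def max_sin_theta(V, labels, k):
    """Sine of the largest principal angle between span(V) and the span
    of the cluster indicator vectors."""
    C = np.eye(k)[labels]              # n x k indicator matrix
    Qc, _ = np.linalg.qr(C)
    Qv, _ = np.linalg.qr(V)
    # Singular values of Qc^T Qv are the cosines of the principal angles.
    cosines = np.linalg.svd(Qc.T @ Qv, compute_uv=False)
    cos_min = float(np.clip(cosines.min(), 0.0, 1.0))
    return float(np.sqrt(1.0 - cos_min ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k, n_per = 4, 200                  # values quoted in the table
    p_in, p_out, q = 0.5, 0.005, 0.3   # hypothetical, not from the paper
    A, labels = sbm_adjacency(k, n_per, p_in, p_out, rng)
    S = uniform_sparsify(A, q, rng)
    print("full graph:", max_sin_theta(bottom_eigvecs(A, k), labels, k))
    print("sparsified:", max_sin_theta(bottom_eigvecs(S, k), labels, k))
```

Dense matrices are used because 800 nodes is small; a larger graph would call for sparse storage and an iterative eigensolver.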