Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Mean Field for the Stochastic Blockmodel: Optimization Landscape and Convergence Issues

Authors: Soumendu Sundar Mukherjee, Purnamrita Sarkar, Y. X. Rachel Wang, Bowei Yan

NeurIPS 2018 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In Figure 1-(a), we have generated a network from an SBM with parameters p = 0.4, q = 0.025, and two equal sized blocks of 100 nodes each. We generate 5000 initializations ψ(0) from Beta(α, β) n and map them to a(0) 1. We perform sample BCAVI updates on ψ(0) with known p, q and color the points in the a(0) 1 co-ordinates according to the limit points they have converged to.
Researcher Affiliation	Academia	Soumendu Sundar Mukherjee, Interdisciplinary Statistical Research Unit (ISRU), Indian Statistical Institute, Kolkata, Kolkata 700108, India, EMAIL; Purnamrita Sarkar, Department of Statistics and Data Science, University of Texas, Austin, Austin, TX 78712, EMAIL; Y. X. Rachel Wang, School of Mathematics and Statistics, University of Sydney, NSW 2006, Australia, EMAIL; Bowei Yan, Department of Statistics and Data Science, University of Texas, Austin, Austin, TX 78712, EMAIL
Pseudocode	No	The paper describes updates using mathematical equations (e.g., equations 4, 8, 9, 10), but does not provide structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide any explicit statements about open-sourcing code for the methodology or links to code repositories.
Open Datasets	No	The paper states: "In Figure 1-(a), we have generated a network from an SBM with parameters p = 0.4, q = 0.025, and two equal sized blocks of 100 nodes each." As the data is generated, it's not a publicly available dataset in the conventional sense that requires external access information.
Dataset Splits	No	The paper describes generating synthetic data and performing simulations, but it does not specify train/validation/test dataset splits, cross-validation, or reference any predefined splits for a dataset.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies	No	The paper does not list specific software dependencies with version numbers (e.g., libraries, frameworks, or programming language versions) used for the experiments.
Experiment Setup	Yes	"In Figure 1-(a), we have generated a network from an SBM with parameters p = 0.4, q = 0.025, and two equal sized blocks of 100 nodes each. We generate 5000 initializations ψ(0) from Beta(α, β) n (for four sets of α and β)." and "For each c0, we initialize ψ(0) such that E(ψ(0)) = (1/2 + c0) 1_{C1} + (1/2 − c0) 1_{C2} with iid noise. The y-axis shows the average distance between ψ(20) and the true Z from 500 such initializations." Also, "For every choice of p, q, a network of size 400 with two equal sized blocks was generated."
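
Because no code is released, the synthetic setup quoted in the Experiment Setup row has to be rebuilt by hand. The following is a minimal sketch, not the authors' implementation, of how the Figure 1-(a) network (p = 0.4, q = 0.025, two blocks of 100 nodes) and the 5000 Beta-distributed initializations might be generated with NumPy. The function names, the seeds, and the default α = β = 1 are assumptions made for illustration; the four (α, β) pairs used in the paper are not given here, and the BCAVI updates themselves are deliberately left out.

```python
import numpy as np


def sample_two_block_sbm(n_per_block=100, p=0.4, q=0.025, seed=0):
    """Sample a symmetric, loop-free adjacency matrix from a two-block SBM.

    Hypothetical helper: p is the within-block edge probability and
    q the between-block edge probability, as quoted in the table.
    """
    rng = np.random.default_rng(seed)
    n = 2 * n_per_block
    labels = np.repeat([0, 1], n_per_block)          # true block memberships
    same_block = labels[:, None] == labels[None, :]  # True where i, j share a block
    edge_probs = np.where(same_block, p, q)
    # Draw each unordered pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n, n)) < edge_probs, k=1)
    adjacency = (upper | upper.T).astype(int)
    return adjacency, labels


def sample_beta_initializations(n, num_inits=5000, alpha=1.0, beta=1.0, seed=0):
    """Draw num_inits initial vectors with i.i.d. Beta(alpha, beta) entries.

    alpha and beta are placeholder values (an assumption), not the paper's.
    """
    rng = np.random.default_rng(seed)
    return rng.beta(alpha, beta, size=(num_inits, n))


adjacency, labels = sample_two_block_sbm()
initializations = sample_beta_initializations(n=adjacency.shape[0])
```

Fixing the seeds makes the sketch repeatable, but it will not reproduce the exact network or initializations behind the published figures, since the paper's random state is not reported.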