Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Mixing Time of Metropolis-Hastings for Bayesian Community Detection

Authors: Bumeng Zhuo, Chao Gao

JMLR 2021

Each entry below gives a reproducibility variable, its classified result, and the LLM response quoted as supporting evidence.
Research Type: Experimental
Evidence: "...followed by some numerical results demonstrating its competitive performance on simulated data sets in Section 4." From Section 4 (Numerical Results): "In this section, we study the numerical performance of the Metropolis-Hastings Algorithm 1, and the inverse temperature parameter ξ is set to be 1 unless otherwise specified. The initial label assignment vector is chosen such that half of the samples are labeled correctly, and the other half are labeled randomly. The same mechanism is also mentioned in Bickel and Chen (2009)."

Researcher Affiliation: Academia
Evidence: "Bumeng Zhuo EMAIL, Chao Gao EMAIL, Department of Statistics, University of Chicago, Chicago, IL 60637, USA"
Pseudocode: Yes
Evidence: Algorithm 1 (a Metropolis-Hastings algorithm for Bayesian community detection).
Input: adjacency matrix A ∈ {0,1}^{n×n}, number of communities K, initial community assignment Z_0, inverse temperature parameter ξ, maximum number of iterations T.
Output: community label assignment Z_T.
For each t ∈ {0, 1, 2, ..., T}: choose an index j ∈ [n] uniformly at random; randomly assign a new label to index j from the set [K] \ {Z_t(j)} to get a new assignment Z'; set Z_{t+1} = Z' with probability ρ(Z_t, Z') = min{1, Π_ξ(Z'|A) / Π_ξ(Z_t|A)}; otherwise set Z_{t+1} = Z_t.
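A minimal Python sketch of the single-site sampler described in Algorithm 1, under stated assumptions: `metropolis_hastings` and `sbm_log_lik` are hypothetical names, and a Bernoulli SBM log-likelihood with assumed known edge probabilities p and q stands in for the paper's posterior Π_ξ(·|A), which is not reproduced here.

```python
import numpy as np

def metropolis_hastings(A, K, Z0, log_post, xi=1.0, T=1000, rng=None):
    """Single-site Metropolis-Hastings over community labels (Algorithm 1 sketch).

    The acceptance probability min{1, Pi_xi(Z'|A) / Pi_xi(Z_t|A)} is evaluated
    in log space, with Pi_xi proportional to Pi(.|A)^xi.
    """
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    Z = np.asarray(Z0).copy()
    for _ in range(T):
        j = rng.integers(n)            # choose an index j in [n] uniformly at random
        new = rng.integers(K - 1)      # propose a label from [K] \ {Z[j]}
        if new >= Z[j]:
            new += 1
        Zp = Z.copy()
        Zp[j] = new
        log_rho = xi * (log_post(Zp, A) - log_post(Z, A))
        if np.log(rng.random()) < min(0.0, log_rho):
            Z = Zp                     # accept the proposed assignment
    return Z

def sbm_log_lik(Z, A, p=0.6, q=0.1):
    """Hypothetical surrogate for the posterior: SBM Bernoulli log-likelihood
    with assumed within-/between-community edge probabilities p and q."""
    same = (Z[:, None] == Z[None, :])
    P = np.where(same, p, q)
    iu = np.triu_indices(len(Z), k=1)  # count each unordered pair once
    a, pr = A[iu], P[iu]
    return float(np.sum(a * np.log(pr) + (1 - a) * np.log(1 - pr)))
```

Because the proposal (pick j uniformly, then a uniformly random different label) is symmetric, the acceptance ratio reduces to the posterior ratio, which is why no proposal terms appear in `log_rho`.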
Open Source Code: Yes
Evidence: "The code is available on https://github.com/zhuobumeng/MH_bayes_SBM."

Open Datasets: No
Evidence: All experiments use simulated networks: "Balanced networks. In this setting, we generate networks with 2500 nodes, and 5 communities, each of which consists of 500 nodes." "Heterogeneous networks. In this setting, we generate networks with 2000 nodes and 4 communities of sizes 200, 400, 600, and 800, respectively."

Dataset Splits: No
Evidence: The paper uses simulated data for its experiments and describes how networks are generated for each scenario (e.g., "Balanced networks. In this setting, we generate networks with 2500 nodes, and 5 communities..."). There is no mention of splitting an existing dataset into training, validation, or test sets.

Hardware Specification: No
Evidence: The paper gives no details about the hardware (e.g., CPU/GPU models, memory) used to run the numerical experiments.

Software Dependencies: No
Evidence: The paper links to its code but does not name the programming languages, libraries, or solvers, or their versions, used to implement the algorithm or conduct the experiments.
Experiment Setup: Yes
Evidence: "In this section, we study the numerical performance of the Metropolis-Hastings Algorithm 1, and the inverse temperature parameter ξ is set to be 1 unless otherwise specified. The initial label assignment vector is chosen such that half of the samples are labeled correctly, and the other half are labeled randomly. ... Balanced networks. In this setting, we generate networks with 2500 nodes, and 5 communities, each of which consists of 500 nodes. ... Heterogeneous networks. In this setting, we generate networks with 2000 nodes and 4 communities of sizes 200, 400, 600, and 800, respectively. The connectivity matrix is set as

  [0.50 0.29 0.35 0.25
   0.29 0.45 0.25 0.30
   0.35 0.25 0.50 0.35
   0.25 0.30 0.35 0.45].

... In each setting, we run 20 experiments with independent initializations and adjacency matrices, and the value of each block is the average number of misclassified samples in the 20 experiments."
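The heterogeneous setting quoted above can be sketched as follows; `generate_sbm` is a hypothetical helper, with the community sizes and connectivity matrix taken from the quoted setup, and the initialization follows the described mechanism (half the labels correct, the other half uniformly random).

```python
import numpy as np

def generate_sbm(sizes, B, rng=None):
    """Sample a symmetric, zero-diagonal adjacency matrix from a stochastic
    block model with community sizes `sizes` and connectivity matrix B."""
    rng = np.random.default_rng(rng)
    z = np.repeat(np.arange(len(sizes)), sizes)  # ground-truth labels
    P = B[z][:, z]                               # pairwise edge probabilities
    U = rng.random((len(z), len(z)))
    A = (np.triu(U, 1) < np.triu(P, 1)).astype(int)
    return A + A.T, z

# Heterogeneous setting: 2000 nodes, 4 communities of sizes 200/400/600/800.
B = np.array([[0.50, 0.29, 0.35, 0.25],
              [0.29, 0.45, 0.25, 0.30],
              [0.35, 0.25, 0.50, 0.35],
              [0.25, 0.30, 0.35, 0.45]])
A, z = generate_sbm([200, 400, 600, 800], B, rng=0)

# Initialization as described: half of the samples labeled correctly,
# the other half labeled uniformly at random.
rng = np.random.default_rng(1)
Z0 = z.copy()
half = rng.choice(len(z), size=len(z) // 2, replace=False)
Z0[half] = rng.integers(4, size=len(half))
```

Sampling only the strict upper triangle and symmetrizing guarantees an undirected graph with no self-loops, matching the A ∈ {0,1}^{n×n} adjacency matrix that Algorithm 1 takes as input.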