Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Community Detection via Measure Space Embedding
Authors: Mark Kozdoba, Shie Mannor
NeurIPS 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the algorithm on standard random graph benchmarks, including some overlapping community benchmarks, and find its performance to be better or at least as good as previously known algorithms. |
| Researcher Affiliation | Academia | Mark Kozdoba The Technion, Haifa, Israel EMAIL Shie Mannor The Technion, Haifa, Israel EMAIL |
| Pseudocode | Yes | Algorithm 1 DER 1: Input: Graph G, walk length L, number of components k. 2: Compute the measures wi. 3: Initialize P1, . . . , Pk to be a random partition such that |Pi| = |V |/k for all i. 4: repeat 5: (1) For all s k, construct µs = µPs. 6: (2) For all s k, set Ps = i V | s = argmax l D(wi, µl) . 7: until the sets Ps do not change |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement) for the source code of the methodology described. |
| Open Datasets | Yes | On the empirical side, we first evaluate our algorithm on a set of random graph benchmarks known as the LFR models, [3]. The LFR benchmark model, [14], is a widely used extension of the stochastic block model, where node degrees and community sizes have power law distribution, as often observed in real graphs. Figure 1a shows the classical Zachary s Karate Club, [22]. Figure 1b shows the political blogs graph, [23]. |
| Dataset Splits | No | The paper describes generating graphs from models and then evaluating on them, e.g., "For each combination of graph size, community size restrictions as above and µ value, we generated 20 graphs from that model and run DER." This indicates data generation and evaluation rather than explicit train/validation/test splits for reproducibility. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | In all runs on DER we have set L = 5 and set k to be the true number of communities for each graph, as was done in [4] for the methods that required it. The DER algorithm was run with L = 2, and k was set to the true number of communities. |