Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Recovering Communities in the General Stochastic Block Model Without Knowing the Parameters
Authors: Emmanuel Abbe, Colin Sandon
NeurIPS 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We tested a simplified version of our algorithm on real data (see [28]), for the blog network of Adamic and Glance 05. We obtained an error rate of about 60/1222 (best trial was 57, worst 67), achieving the state-of-the-art (as described in [32]). |
| Researcher Affiliation | Academia | Emmanuel Abbe Department of Electrical Engineering and PACM Princeton University Princeton, NJ 08540 EMAIL Colin Sandon Department of Mathematics Princeton University Princeton, NJ 08540 EMAIL |
| Pseudocode | Yes | 3.1.1 Simplified version of the algorithm for the symmetric case... 1. Set r = 3/4 log n/ log d and put each of the graph s edges in E with probability 1/10. 2. Set kmax = 1/δ and select kmax ln(4kmax) random vertices, v1, ..., vkmax ln(4kmax). 3. Compute Ir,r [E](vi vj) for each i and j. 4. If there is a possible assignment... 5. For every v in the graph, guess that v is in the same community as the v[i] that maximizes the value of Ir,r [E](v[i] v ). Also, "The Agnostic-degree-profiling algorithm. The inputs are (G, γ)... (1) Define the graph g... (2) Run Agnostic-sphere-comparison... (3) Determine the size... (4) For each node v... (5) Use σ v to get new estimates... (6) For each node v..." |
| Open Source Code | No | The paper does not provide any concrete access to source code, such as a repository link or an explicit statement about code release. |
| Open Datasets | Yes | We tested a simplified version of our algorithm on real data (see [28]), for the blog network of Adamic and Glance 05. |
| Dataset Splits | No | The paper mentions testing on real data but does not provide specific dataset split information (e.g., percentages or counts) for training, validation, or test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, or specific solvers used. |
| Experiment Setup | No | The paper describes algorithms and tests them on real data, but it does not provide specific experimental setup details such as concrete hyperparameter values, training configurations, or system-level settings. |