Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Bipartite Stochastic Block Models with Tiny Clusters
Authors: Stefan Neumann
NeurIPS 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the algorithm on synthetic and on real-world data; the experiments show that the algorithm can find extremely small clusters even in presence of high destructive noise. |
| Researcher Affiliation | Academia | Stefan Neumann University of Vienna Faculty of Computer Science Vienna, Austria EMAIL |
| Pseudocode | Yes | Algorithm 1 The pcv algorithm Input: G a bipartite m × n graph, k, p, q |
| Open Source Code | Yes | We implemented Algorithm 1 in Python. To compute the truncated SVD we used scikit-learn [23]. The source code is available in the supplementary material. |
| Open Datasets | Yes | The source code and the synthetic data are provided in the supplementary materials. The Book Crossing dataset originates from Ziegler et al. [30]. |
| Dataset Splits | No | No explicit statement of dataset splits (e.g., train/validation percentages or counts) was found. The paper mentions generating synthetic data and evaluating on random graphs, but not how a single dataset was partitioned for training, validation, and testing. |
| Hardware Specification | Yes | The experiments were done on a Mac Book Air with a 1.6 GHz Intel Core i5 and 8 GB RAM. |
| Software Dependencies | No | The paper mentions 'Python' and 'scikit-learn [23]' but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | When not mentioned otherwise, the parameters were set to n = 1000, k = 8, β = 70, and m = βk (i.e., 1000 vertices on the right, 8 ground-truth clusters on both sides and left-side clusters of size 70). The size of the right-side clusters was set to r = 8. The parameters p and q were set depending on the dataset. |
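
The default parameters in the Experiment Setup row can be illustrated with a minimal sketch of a synthetic bipartite graph generator. This is not the author's code. It assumes that a left vertex in cluster i connects to each right vertex in cluster i with probability p and to every other right vertex with probability q, and that right vertices outside the k planted clusters belong to no cluster. The function name `sample_bipartite_sbm` and the values `p=0.7` and `q=0.05` are hypothetical placeholders; the source only states that p and q were set depending on the dataset.

```python
import numpy as np


def sample_bipartite_sbm(n=1000, k=8, beta=70, r=8, p=0.7, q=0.05, seed=0):
    """Sample an m x n biadjacency matrix with k planted cluster pairs.

    n, k, beta and r follow the defaults quoted in the table.
    p and q are hypothetical values chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    m = beta * k  # number of left vertices (m = beta * k)

    # Left vertex i belongs to cluster i // beta.
    left_labels = np.repeat(np.arange(k), beta)

    # The first k * r right vertices form k clusters of size r;
    # the remaining right vertices get label -1 (assumed unclustered).
    right_labels = np.full(n, -1)
    right_labels[: k * r] = np.repeat(np.arange(k), r)

    # Edge probability is p where the cluster labels match, q elsewhere.
    same_cluster = left_labels[:, None] == right_labels[None, :]
    probs = np.where(same_cluster, p, q)

    # Draw each potential edge independently.
    B = (rng.random((m, n)) < probs).astype(np.uint8)
    return B, left_labels, right_labels


B, left_labels, right_labels = sample_bipartite_sbm()
print(B.shape)  # (560, 1000)
```

With the defaults, the left side has 8 × 70 = 560 vertices, and only 8 × 8 = 64 of the 1000 right vertices sit inside a planted cluster, which is the "tiny clusters" regime named in the paper title.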