Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Clustering via Hedonic Games: New Concepts and Algorithms

Authors: Gergely Csáji, Alexander Gundert, Jörg Rothe, Ildikó Schlotter

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Exploring these concepts from an algorithmic viewpoint, we design efficient mechanisms for finding locally stable or locally popular partitions. Besides gaining theoretical insight into the computational complexity of these problems, we perform simulations that demonstrate how our algorithms can be successfully applied in clustering and community detection.
Researcher Affiliation	Academia	Gergely Csáji ELTE Centre of Economic and Regional Studies Budapest, Hungary EMAIL Alexander Gundert Heinrich-Heine-Universität Düsseldorf Düsseldorf, Germany EMAIL Jörg Rothe Heinrich-Heine-Universität Düsseldorf Düsseldorf, Germany EMAIL Ildikó Schlotter ELTE Centre of Economic and Regional Studies Budapest, Hungary EMAIL
Pseudocode	Yes	Loc Pop: 1. Start from an arbitrary partition π. 2. While there exists a coalition C ∈ π ∪ {∅} and an agent i ∉ C such that Λ(π, π_{i→C}) > 0, replace π with π_{i→C} and continue. Loc Stab: 1. Start from an arbitrary partition π. 2. While there exists a coalition C ∈ π ∪ {∅} and an agent i ∉ C such that Λ(π, π_{i→C}) > 0 and C ∪ {i} ≻_i π(i), replace π with π_{i→C} and continue.
Open Source Code	Yes	All codes are available in the supplementary material.
Open Datasets	Yes	Karate club [23, 31, 43]: a 34-node benchmark dataset for community detection. Jazz musicians [24, 32]: collaboration network of 198 jazz musicians; nodes represent musicians, edges represent co-membership in a band. Cora dataset [36, 35]: a citation network of 2708 machine learning papers classified into seven classes; edges denote citation links. Iris dataset [21]: 150 samples from three Iris species, features are sepal and petal sizes. Breast cancer Wisconsin dataset [42]: 569 samples from diagnostic images (30 dimensional datapoints). Moons dataset [38]: two half-moons generated by make_moons (300 points, 0.05 noise).
Dataset Splits	No	Our simulations did not include training, nor data splits.
Hardware Specification	Yes	We implemented Loc Pop and Loc Stab in Python and ran the simulations on a computer with AMD Ryzen 7735HS CPU and 16GB RAM.
Software Dependencies	No	The paper mentions 'Python' but does not specify a version number for Python or any libraries used, such as scikit-learn, Louvain, or Leiden.
Experiment Setup	Yes	For our algorithms, we chose parameters f ≤ e ∈ [0, 1] to create the friendship and enmity graphs: for two data points x and y at distance d(x, y) in an instance I, we added (x, y) to the friendship graph if d(x, y) ≤ f · diam(I) and to the enmity graph if d(x, y) > e · diam(I), where diam(I) denotes the diameter of I, i.e., the maximum distance between any two points in I. We used the parameter values (f, e) ∈ {(0.2, 0.2), (0.25, 0.35), (0.4, 0.4)}. For each parameterization, we considered the appreciation-of-friends (AF), aversion-to-enemies (AE), and the balanced (B) preference domains. ... In particular, we considered the following variations for the initial clustering: (i) putting each agent into a singleton cluster (Loc Pop-S, Loc Stab-S), (ii) dividing agents randomly into k clusters where k is the predicted number of clusters (Loc Pop-P, Loc Stab-P), and (iii) using the output of the Leiden algorithm (Loc Pop-Ld, Loc Stab-Ld).
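
The threshold rule quoted under Experiment Setup can be illustrated with a minimal sketch. This is not the authors' code: the function name build_graphs is hypothetical, and Euclidean distance is an assumption, since the quoted text does not name the metric. A pair joins the friendship graph when its distance is at most f times the diameter, and the enmity graph when its distance exceeds e times the diameter.

```python
from itertools import combinations
from math import dist  # Euclidean distance (assumed metric, not stated in the source)


def build_graphs(points, f, e):
    """Return (friends, enemies) as sets of index pairs (i, j) with i < j.

    Hypothetical helper: thresholds are f * diam and e * diam, where diam
    is the maximum pairwise distance in the instance.
    """
    # Pairwise distances over all unordered pairs of points.
    d = {(i, j): dist(points[i], points[j])
         for i, j in combinations(range(len(points)), 2)}
    diam = max(d.values(), default=0.0)
    friends = {pair for pair, v in d.items() if v <= f * diam}
    enemies = {pair for pair, v in d.items() if v > e * diam}
    return friends, enemies


# Toy instance: two nearby points and one far away (diameter 10).
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
friends, enemies = build_graphs(points, f=0.2, e=0.2)
print(friends, enemies)
```

With f = e, every pair lands in exactly one of the two graphs; with f < e, as in the (0.25, 0.35) setting, pairs whose distance falls between the two thresholds belong to neither.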