Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Exact Clustering of Weighted Graphs via Semidefinite Programming
Authors: Aleksis Pirinen, Brendan Ames
JMLR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evidence from numerical simulations is also provided to support these theoretical phase transitions to perfect recovery of the cluster structure. 4. Numerical Methods and Simulations We conclude with a discussion of an algorithm for solution of (5) based on the alternating direction method of multipliers (ADMM), and provide the results of a series of experiments that empirically verify the phase transitions predicted in Section 2.2. |
| Researcher Affiliation | Academia | Aleksis Pirinen EMAIL Centre for Mathematical Sciences Lund University Lund, Sweden Brendan Ames EMAIL Department of Mathematics The University of Alabama Tuscaloosa, Alabama, AL 35487-0350, USA |
| Pseudocode | Yes | Algorithm 1 ADMM for (1) Input: Initial iterates X0 = Y 0 = Z0 = 0, augmented Lagrangian parameter ρ > 0, and stopping tolerance ϵ > 0. Output: Approximate solution (X , Y , Z ) of (1). For t = 0, 1, 2 . . . until converged Compute spectral decomposition V t Diagλt(V t)T = U t = Xt (W + Zt)/ρ. Project λt onto the nonnegative simplex {y Rn : e T y = k, y 0} to obtain λt. Update Y t+1 = V t Diag λt(V t)T . Compute approximate optimal solution z of the dual subproblem (33) using spectral projected gradient method of Birgin et al. (2000). Update Xt+1 = h Y ts+1 + Zt/ρ z e+e(z )T Update Zt+1 using approximate dual ascent Zt+1 = Zt ρ(Xt+1 Y t+1). Compute primal feasibilty gap pfeas = min min ij Y t ij, min e Y te . Compute estimates of primal and dual objective values (note that v(t+1) d is not necessarily a lower bound on the optimal dual value, but is asymptotically converging to the optimal dual value): v(t+1) p = Tr(W Y t) v(t+1) d = kλmin(W + Zt+1) Tr(Xt+1Zt+1). Calculate relative duality gap relgap = |v(t+1) p v(t+1) d | max n |v(t+1) p |, 1 o. Declare sequence of iterates to have converged if relgap < ϵ and pfeas > ϵ. End For |
| Open Source Code | No | The paper does not provide an explicit statement about releasing code or a link to a code repository for the methodology described. |
| Open Datasets | No | We perform two sets of experiments, one to illustrate the recovery guarantee for dense graphs sampled from the heterogeneous planted cluster model and another to illustrate the guarantee when the noise is sparse. For the dense graph experiments, we fix n = 1000, and sample 10 graphs from the heterogeneous planted cluster model corresponding to the Bernoulli distributions Ωij = Bern(pij) with probabilities of success pij given by (...) We choose the number of clusters k = n/ˆr and distribute the remaining n kˆr nodes evenly among k 1 clusters to ensure that at least one cluster has minimum size. |
| Dataset Splits | No | The paper describes generating synthetic graphs from a 'planted cluster model' for experiments, rather than using an existing dataset with train/test/validation splits. It specifies parameters for graph generation (n, k, ˆr, pij, q) but not data partitioning for an external dataset. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as CPU, GPU models, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions mathematical methods like the Sherman-Morrison Woodbury Formula and the spectral projected gradient method but does not specify any software libraries or tools with version numbers used for implementing their algorithm. |
| Experiment Setup | Yes | For each graph G, we call the ADMM algorithm sketched above to solve (5); in the algorithm, we use penalty parameter ρ = min {max {5n/k, 80} , 500} /2, stopping tolerance ϵ = 10 4, and maximum number of iterations 100. We declare the block structure of G to be recovered if X X0 2 F / X0 2 F < 10 3, where X is the solution returned by the ADMM algorithm and X0 is the proposed solution given by (3). |