Graph Clustering: Block-models and model free results

Authors: Yali Wan, Marina Meilă

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 6 Experimental evaluation. Given G, we obtain a clustering C0 by spectral clustering [15]. Then we calculate clustering C by perturbing C0 with gradually increasing noise. For each C, we construct the PFM(C, G) and SBM(C, G) models, and further compute ε, δ and δ0. If δ ≤ δ0, C is guaranteed to be stable by the theorems. In the remainder of this section, we describe the data generating process for the simulated datasets and the results we obtained.
Researcher Affiliation | Academia | Yali Wan, Department of Statistics, University of Washington, Seattle, WA 98195-4322, USA, yaliwan@washington.edu; Marina Meilă, Department of Statistics, University of Washington, Seattle, WA 98195-4322, USA, mmp@stat.washington.edu
Pseudocode | Yes | PFM Estimation Algorithm. Input: Graph G with $\hat{A}, \hat{D}, \hat{L}, \hat{Y}, \hat{\Lambda}$, clustering $\mathcal{C}$ with indicator matrix $Z$. Output: $(A, L) = \mathrm{PFM}(G, \mathcal{C})$. 1. Construct an orthogonal matrix derived from $Z$: $Y_Z = \hat{D}^{1/2} Z C^{-1/2}$, with $C = Z^T \hat{D} Z$ the column normalization of $Z$. (5) 2. Project $Y_Z$ on $\hat{Y}$ and perform Singular Value Decomposition: $F = Y_Z^T \hat{Y} = U \Sigma V^T$. (6) 3. Change basis in $\mathcal{R}(Y_Z)$ to align with $\hat{Y}$: $Y = Y_Z U V^T$. Complete $Y$ to an orthonormal basis $[Y \; B]$ of $\mathbb{R}^n$. (7) 4. Construct Laplacian $L$ and edge probability matrix $A$: $L = Y \hat{\Lambda} Y^T + (B B^T) \hat{L} (B B^T)$, $A = \hat{D}^{1/2} L \hat{D}^{1/2}$. (8)
Open Source Code | No | The paper does not provide explicit statements or links for open-source code for the described methodology.
Open Datasets | Yes | Political Blogs Dataset: A directed network A of hyperlinks between weblogs on US politics, compiled from online directories by Adamic and Glance [2], where each blog is assigned a political leaning, liberal or conservative, based on its blog content. The network A contains 1490 blogs.
Dataset Splits | No | The paper describes dataset generation parameters and cluster sizes, but does not provide specific training, validation, or test dataset splits.
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | No | The 'Experiment Setup' section describes the data generation process and computed quantities (ε, δ, δ0) but does not provide specific hyperparameter values or detailed training configurations for any model.
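
The four steps of the PFM Estimation Algorithm quoted in the Pseudocode row can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code (the paper releases none). It assumes a symmetric adjacency matrix with positive degrees, the normalized Laplacian convention $\hat{L} = \hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2}$ implied by step 4, and that $\hat{Y}, \hat{\Lambda}$ are the K eigenpairs of $\hat{L}$ largest in absolute value. The function name `pfm_estimate` and its arguments are hypothetical. The basis B is never formed explicitly, since only the projector $B B^T = I - Y Y^T$ is needed.

```python
import numpy as np

def pfm_estimate(A_hat, labels, K):
    """Sketch of (A, L) = PFM(G, C); names and conventions are assumptions."""
    A_hat = np.asarray(A_hat, dtype=float)
    labels = np.asarray(labels)
    n = A_hat.shape[0]

    # Observed quantities: degrees and normalized Laplacian D^{-1/2} A D^{-1/2}.
    d = A_hat.sum(axis=1)                      # assumed strictly positive
    d_sqrt = np.sqrt(d)
    L_hat = A_hat / np.outer(d_sqrt, d_sqrt)

    # Assumption: Y_hat, Lambda_hat are the K eigenpairs largest in |eigenvalue|.
    w, V = np.linalg.eigh(L_hat)
    top = np.argsort(-np.abs(w))[:K]
    Y_hat, Lam_hat = V[:, top], np.diag(w[top])

    # Step 1: Y_Z = D^{1/2} Z C^{-1/2}, where C = Z^T D Z is diagonal
    # (cluster volumes); every cluster must be non-empty.
    Z = np.eye(K)[labels]                      # n x K indicator matrix
    vol = Z.T @ d                              # diagonal of C
    Y_Z = (d_sqrt[:, None] * Z) / np.sqrt(vol)

    # Step 2: F = Y_Z^T Y_hat = U Sigma V^T.
    U, _, Vt = np.linalg.svd(Y_Z.T @ Y_hat)

    # Step 3: rotate Y_Z to align with Y_hat; B B^T is the projector
    # onto the orthogonal complement of the columns of Y.
    Y = Y_Z @ U @ Vt
    P = np.eye(n) - Y @ Y.T

    # Step 4: Laplacian L and edge probability matrix A = D^{1/2} L D^{1/2}.
    L = Y @ Lam_hat @ Y.T + P @ L_hat @ P
    A = d_sqrt[:, None] * L * d_sqrt[None, :]
    return A, L
```

Under these assumptions, a graph whose adjacency matrix is already exactly block-constant with respect to the given clustering is returned unchanged, which gives a quick sanity check of the sketch.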