reproducibilityindex.ai

Graph-based Semi-supervised Local Clustering with Few Labeled Nodes

Authors: Zhaiming Shen, Ming-Jun Lai, Sheng Li

IJCAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experimental results on various datasets demonstrate the effectiveness of our approach. Extensive experiments are conducted on various benchmark datasets to show our approach outperforms its counterparts [Lai and Mckenzie, 2020; Lai and Shen, 2023]. Results also show that our approach is favorable than many other state-of-the-art semi-supervised clustering algorithms.
Researcher Affiliation	Academia	Zhaiming Shen (1), Ming-Jun Lai (1), and Sheng Li (2). (1) University of Georgia, Athens, GA, USA; (2) University of Virginia, Charlottesville, VA, USA. {zhaiming.shen, mjlai}@uga.edu, shengli@virginia.edu
Pseudocode	Yes	Algorithm 1 Compressive Sensing of Local Cluster Extraction (CS-LCE)
Open Source Code	Yes	We make the supplement and code available at: https://github.com/zzzzms/LocalClustering.
Open Datasets	Yes	We use simulated stochastic block model, simulated geometric data with three particular shapes, network data on political blogs [Adamic and Glance, 2005], OptDigits¹, AT&T Database of Faces², MNIST³, and USPS⁴ as our benchmark datasets. (Footnotes 1 to 4 give URLs for OptDigits, AT&T Database of Faces, MNIST, and USPS; the political blogs data is cited as [Adamic and Glance, 2005].)
Dataset Splits	No	The paper mentions 'label ratios' for seeds (e.g., 10% in Table 3) but does not provide specific training, validation, or test dataset splits (e.g., '70% training, 15% validation, 15% test') or references to standard splits that define these partitions.
Hardware Specification	No	No specific hardware details (e.g., CPU/GPU models, memory, or processing units) used for the experiments were provided.
Software Dependencies	No	No specific software dependencies or versions (e.g., library names with version numbers) were mentioned for reproducibility.
Experiment Setup	Yes	Parameters: estimated size n̂₁ ≈ |C₁|, random walk threshold parameter ϵ ∈ (0, 1), random walk depth t ∈ ℤ⁺, sparsity parameter γ ∈ [0.1, 0.5], and rejection parameter R ∈ [0.1, 0.9]. "For convenience, let us fix γ = 0.4 for the rest of discussion." Also: "we fix k = 3 and vary n among 600, 1200, 1800, 2400, 3000. We choose p = 5 log n/n, q = log n/n." On seeds: "With five labeled vertices as seeds" and "we randomly select 10 seeds for each of the cluster."
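
The stochastic block model setting quoted in the Experiment Setup row (k = 3, p = 5 log n/n, q = log n/n) can be sketched roughly as below. This is a minimal illustration, not the authors' code: the function names `sbm_adjacency` and `sample_seeds` are hypothetical, the clusters are assumed to be of equal size, `log` is assumed to be the natural logarithm, and NumPy is an assumed dependency (the paper lists none).

```python
import numpy as np


def sbm_adjacency(n, k, p, q, rng):
    """Sample an undirected SBM graph with k equal-size clusters (assumption).

    Vertices in the same cluster are joined with probability p,
    vertices in different clusters with probability q.
    """
    if n % k != 0:
        raise ValueError("this sketch assumes k divides n")
    labels = np.repeat(np.arange(k), n // k)
    same_cluster = labels[:, None] == labels[None, :]
    edge_prob = np.where(same_cluster, p, q)
    # Draw only the strict upper triangle, then mirror it so the
    # adjacency matrix is symmetric with an empty diagonal.
    upper = np.triu(rng.random((n, n)) < edge_prob, k=1)
    adjacency = (upper | upper.T).astype(np.int8)
    return adjacency, labels


def sample_seeds(labels, cluster, num_seeds, rng):
    """Pick num_seeds distinct labeled vertices from one ground-truth cluster."""
    members = np.flatnonzero(labels == cluster)
    return rng.choice(members, size=num_seeds, replace=False)


rng = np.random.default_rng(0)
n, k = 600, 3                    # smallest n in the quoted range
p = 5 * np.log(n) / n            # intra-cluster edge probability
q = np.log(n) / n                # inter-cluster edge probability
adjacency, labels = sbm_adjacency(n, k, p, q, rng)
seeds = sample_seeds(labels, cluster=0, num_seeds=5, rng=rng)
```

Swapping `n` for 1200, 1800, 2400, or 3000 covers the other graph sizes quoted above. The seeds would then be passed to a local clustering routine such as CS-LCE, which is not reproduced here.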