Graph-based Semi-supervised Local Clustering with Few Labeled Nodes
Authors: Zhaiming Shen, Ming-Jun Lai, Sheng Li
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on various datasets demonstrate the effectiveness of our approach. Extensive experiments are conducted on various benchmark datasets to show our approach outperforms its counterparts [Lai and Mckenzie, 2020; Lai and Shen, 2023]. Results also show that our approach is more favorable than many other state-of-the-art semi-supervised clustering algorithms. |
| Researcher Affiliation | Academia | Zhaiming Shen¹, Ming-Jun Lai¹ and Sheng Li²; ¹University of Georgia, Athens, GA, USA; ²University of Virginia, Charlottesville, VA, USA; {zhaiming.shen, mjlai}@uga.edu, shengli@virginia.edu |
| Pseudocode | Yes | Algorithm 1 Compressive Sensing of Local Cluster Extraction (CS-LCE) |
| Open Source Code | Yes | We make the supplement and code available at: https://github.com/zzzzms/LocalClustering. |
| Open Datasets | Yes | We use simulated stochastic block model, simulated geometric data with three particular shapes, network data on political blogs [Adamic and Glance, 2005], OptDigits, AT&T Database of Faces, MNIST, and USPS as our benchmark datasets. (Footnotes provide URLs for OptDigits, AT&T Database of Faces, MNIST, and USPS; [Adamic and Glance, 2005] is cited for the political blogs data.) |
| Dataset Splits | No | The paper mentions 'label ratios' for seeds (e.g., 10% in Table 3) but does not provide specific training, validation, or test dataset splits (e.g., '70% training, 15% validation, 15% test') or references to standard splits that define these partitions. |
| Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory, or processing units) used for the experiments were provided. |
| Software Dependencies | No | No specific software dependencies or versions (e.g., library names with version numbers) were mentioned for reproducibility. |
| Experiment Setup | Yes | Parameters: estimated size $\hat{n}_1 \approx \lvert C_1 \rvert$, random walk threshold parameter $\epsilon \in (0, 1)$, random walk depth $t \in \mathbb{Z}^+$, sparsity parameter $\gamma \in [0.1, 0.5]$, rejection parameter $R \in [0.1, 0.9]$. "For convenience, let us fix $\gamma = 0.4$ for the rest of discussion." Also: "we fix $k = 3$ and vary $n$ among 600, 1200, 1800, 2400, 3000. We choose $p = 5 \log n / n$, $q = \log n / n$." Seeds: "With five labeled vertices as seeds" and "we randomly select 10 seeds for each of the cluster." |
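
The stochastic block model settings quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `sbm_adjacency` and `pick_seeds` are hypothetical, and the sketch assumes equal-size clusters, the natural logarithm, and uniform random seed selection within a cluster, none of which the excerpt states explicitly.

```python
import numpy as np

def sbm_adjacency(n, num_clusters=3, seed=0):
    """Sample a symmetric SBM graph (assumes equal-size clusters)."""
    rng = np.random.default_rng(seed)
    p = 5 * np.log(n) / n  # intra-cluster edge probability (natural log assumed)
    q = np.log(n) / n      # inter-cluster edge probability
    # Ground-truth labels: cluster c owns a contiguous block of n // num_clusters nodes.
    labels = np.repeat(np.arange(num_clusters), n // num_clusters)
    same_cluster = labels[:, None] == labels[None, :]
    edge_prob = np.where(same_cluster, p, q)
    # Draw only the strict upper triangle, then mirror it: no self-loops, symmetric.
    upper = np.triu(rng.random(edge_prob.shape) < edge_prob, k=1)
    adjacency = (upper | upper.T).astype(np.int8)
    return adjacency, labels

def pick_seeds(labels, cluster, num_seeds, rng):
    """Choose labeled seed vertices uniformly at random from one cluster."""
    members = np.flatnonzero(labels == cluster)
    return rng.choice(members, size=num_seeds, replace=False)

# Smallest setting from the excerpt: k = 3, n = 600, 10 seeds per cluster.
adjacency, labels = sbm_adjacency(600)
seeds = pick_seeds(labels, cluster=0, num_seeds=10, rng=np.random.default_rng(1))
```

All of the quoted values of n (600 through 3000) are divisible by 3, so the equal-size assumption leaves no leftover nodes in these settings.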