S3GC: Scalable Self-Supervised Graph Clustering
Authors: Fnu Devvrit, Aditya Sinha, Inderjit Dhillon, Prateek Jain
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate that S3GC is able to learn the correct cluster structure even when graph information or node features are individually not informative enough to learn correct clusters. Finally, using extensive evaluation on a variety of benchmarks, we demonstrate that S3GC is able to significantly outperform state-of-the-art methods in terms of clustering accuracy with as much as 5% gain in NMI while being scalable to graphs of size 100M. |
| Researcher Affiliation | Collaboration | Fnu Devvrit Department of Computer Science University of Texas at Austin devvrit.03@gmail.com Aditya Sinha Google Research Bengaluru, India, 560016 sinhaaditya@google.com Inderjit Dhillon Google & Department of Computer Science University of Texas at Austin isd@google.com Prateek Jain Google Research Bengaluru, India, 560016 prajain@google.com |
| Pseudocode | Yes | Algorithm 1 S3GC: Training and Backpropagation |
| Open Source Code | Yes | Code: Implementation code of S3GC is available at: https://github.com/devvrit/S3GC |
| Open Datasets | Yes | We use 3 small scale, 3 moderate/large scale, and 1 extra large scale dataset from GCN [28], GraphSAGE [19] and the OGB-suite [25] to demonstrate the efficacy of our method. The details of the datasets are given in Table 2 and additional details of the sources are mentioned in the Appendix. |
| Dataset Splits | Yes | We perform model selection based on the NMI on the validation set and evaluate all the metrics for this model. |
| Hardware Specification | Yes | We utilize a single Nvidia A100 GPU with 40GB memory for training each method for a maximum duration of 1 hour for each experiment in Table 3. For ogbn-papers100M we allow up to 24 hours of training and up to 300GB main memory in addition. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Hyperparameter Tuning: S3GC requires selection of minimal hyperparameters: we use k = 2 for the k-hop Diffusion Matrix Sk... We additionally tune the learning rate, batch size, and random walk parameters, namely the walk length l, while using the default values of p = 1 and q = 1 for the bias parameters in the walk. |
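The experiment-setup row notes that S3GC uses node2vec-style bias parameters fixed at p = 1 and q = 1. With both biases at 1, the biased walk reduces to a plain uniform random walk over neighbors, which can be sketched as follows (the graph and function below are illustrative, not taken from the paper's code):

```python
import random

def uniform_random_walk(adj, start, walk_length, seed=None):
    """Uniform random walk on an adjacency-list graph.

    With node2vec bias parameters p = q = 1, the biased second-order
    walk reduces to this simple first-order uniform walk.
    """
    rng = random.Random(seed)
    walk = [start]
    for _ in range(walk_length - 1):
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk

# Toy graph (hypothetical): a triangle with one pendant node
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walk = uniform_random_walk(adj, start=0, walk_length=5, seed=0)
print(walk)  # consecutive nodes in the walk are always adjacent in the graph
```

The walk length here plays the role of the tuned parameter l mentioned in the row above.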
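The dataset-splits row states that model selection is done via NMI on the validation set. For reference, normalized mutual information between two labelings can be computed from scratch as below; this is a minimal stdlib sketch using the arithmetic-mean normalization (the labelings are illustrative, not results from the paper):

```python
from collections import Counter
from math import log

def nmi(u, v):
    """Normalized mutual information between labelings u and v,
    normalized by the arithmetic mean of the two entropies."""
    n = len(u)
    pu, pv, puv = Counter(u), Counter(v), Counter(zip(u, v))
    h_u = -sum(c / n * log(c / n) for c in pu.values())
    h_v = -sum(c / n * log(c / n) for c in pv.values())
    mi = sum(c / n * log((c / n) / ((pu[a] / n) * (pv[b] / n)))
             for (a, b), c in puv.items())
    denom = (h_u + h_v) / 2
    return mi / denom if denom > 0 else 1.0

# NMI is permutation-invariant: relabeling the clusters leaves it unchanged
true_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [1, 1, 0, 0, 2, 2]  # same partition, permuted cluster ids
print(nmi(true_labels, pred_labels))  # a perfect (permuted) match scores 1.0
```

Since NMI ignores cluster label permutations, it is a natural model-selection criterion for unsupervised clustering, where predicted cluster ids carry no fixed correspondence to ground-truth classes.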