S3GC: Scalable Self-Supervised Graph Clustering
Authors: Fnu Devvrit, Aditya Sinha, Inderjit Dhillon, Prateek Jain
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate that S3GC is able to learn the correct cluster structure even when graph information or node features are individually not informative enough to learn correct clusters. Finally, using extensive evaluation on a variety of benchmarks, we demonstrate that S3GC is able to significantly outperform state-of-the-art methods in terms of clustering accuracy with as much as 5% gain in NMI while being scalable to graphs of size 100M. |
| Researcher Affiliation | Collaboration | Fnu Devvrit Department of Computer Science University of Texas at Austin devvrit.03@gmail.com Aditya Sinha Google Research Bengaluru, India, 560016 sinhaaditya@google.com Inderjit Dhillon Google & Department of Computer Science University of Texas at Austin isd@google.com Prateek Jain Google Research Bengaluru, India, 560016 prajain@google.com |
| Pseudocode | Yes | Algorithm 1 S3GC: Training and Backpropagation |
| Open Source Code | Yes | Code: Implementation code of S3GC is available at: https://github.com/devvrit/S3GC |
| Open Datasets | Yes | We use 3 small scale, 3 moderate/large scale, and 1 extra large scale dataset from GCN [28], GraphSAGE [19] and the OGB-suite [25] to demonstrate the efficacy of our method. The details of the datasets are given in Table 2 and additional details of the sources are mentioned in the Appendix. |
| Dataset Splits | Yes | We perform model selection based on the NMI on the validation set and evaluate all the metrics for this model. |
| Hardware Specification | Yes | We utilize a single Nvidia A100 GPU with 40GB memory for training each method for a maximum duration of 1 hour for each experiment in Table 3. For ogbn-papers100M we allow up to 24 hours of training and up to 300GB main memory in addition. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Hyperparameter Tuning: S3GC requires selection of minimal hyperparameters: we use k = 2 for the k-hop Diffusion Matrix Sk... We additionally tune the learning rate, batch size, and random walk parameters, namely the walk length l, while using the default values of p = 1 and q = 1 for the bias parameters in the walk. |
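The experiment-setup row notes that S3GC uses node2vec-style bias parameters fixed at p = 1 and q = 1. With both biases at 1, the biased walk reduces to a plain uniform random walk over neighbors, which can be sketched as follows (the graph and function below are illustrative, not taken from the paper's code):

```python
import random

def uniform_random_walk(adj, start, walk_length, seed=None):
    """Uniform random walk on an adjacency-list graph.

    With node2vec bias parameters p = q = 1, the biased second-order
    walk reduces to this simple first-order uniform walk.
    """
    rng = random.Random(seed)
    walk = [start]
    for _ in range(walk_length - 1):
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk

# Toy graph (hypothetical): a triangle with one pendant node
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walk = uniform_random_walk(adj, start=0, walk_length=5, seed=0)
print(walk)  # consecutive nodes in the walk are always adjacent in the graph
```

The walk length here plays the role of the tuned parameter l mentioned in the row above.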
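The dataset-splits row states that model selection is done via NMI on the validation set. For reference, normalized mutual information between two labelings can be computed from scratch as below; this is a minimal stdlib sketch using the arithmetic-mean normalization (the labelings are illustrative, not results from the paper):

```python
from collections import Counter
from math import log

def nmi(u, v):
    """Normalized mutual information between labelings u and v,
    normalized by the arithmetic mean of the two entropies."""
    n = len(u)
    pu, pv, puv = Counter(u), Counter(v), Counter(zip(u, v))
    h_u = -sum(c / n * log(c / n) for c in pu.values())
    h_v = -sum(c / n * log(c / n) for c in pv.values())
    mi = sum(c / n * log((c / n) / ((pu[a] / n) * (pv[b] / n)))
             for (a, b), c in puv.items())
    denom = (h_u + h_v) / 2
    return mi / denom if denom > 0 else 1.0

# NMI is permutation-invariant: relabeling the clusters leaves it unchanged
true_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [1, 1, 0, 0, 2, 2]  # same partition, permuted cluster ids
print(nmi(true_labels, pred_labels))  # a perfect (permuted) match scores 1.0
```

Since NMI ignores cluster label permutations, it is a natural model-selection criterion for unsupervised clustering, where predicted cluster ids carry no fixed correspondence to ground-truth classes.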