Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Spectral Augmentation for Self-Supervised Learning on Graphs
Authors: Lu Lin, Jinghui Chen, Hongning Wang
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on both graph and node classification tasks demonstrate the effectiveness of our method in unsupervised learning, as well as the generalization capability in transfer learning and the robustness property under adversarial attacks. |
| Researcher Affiliation | Academia | Lu Lin, Jinghui Chen, Hongning Wang; The Pennsylvania State University, University of Virginia; EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 illustrates the detailed steps of deploying SPAN in an instantiation of GCL. and Algorithm 1: Deploying SPAN in an instantiation of GCL |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | Yes | The proposed SPAN is evaluated on 25 graph datasets. Specifically, for the node classification task, we included Cora, Citeseer, PubMed citation networks (Sen et al., 2008), Wiki-CS hyperlink network (Mernyei & Cangea, 2020), Amazon-Computer and Amazon-Photo co-purchase network (Shchur et al., 2018), and Coauthor-CS network (Shchur et al., 2018). For the graph classification and regression tasks, we employed TU biochemical and social networks (Morris et al., 2020), Open Graph Benchmark (OGB) (Hu et al., 2020a) and ZINC (Hu et al., 2020b; Gómez-Bombarelli et al., 2018) chemical molecules, and Protein-Protein Interaction (PPI) biological networks (Hu et al., 2020b; Zitnik & Leskovec, 2017). |
| Dataset Splits | Yes | We adopt the given data split for OGB dataset, and use 10-fold cross validation for TU dataset as it does not provide such a split. |
| Hardware Specification | Yes | The experiments were performed on Nvidia GeForce RTX 2080Ti (12GB) GPU cards for most datasets, and RTX A6000 (48GB) cards for PubMed and Coauthor-CS datasets. |
| Software Dependencies | No | The paper mentions using PyG (PyTorch Geometric) library for datasets and Adam optimizer, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For node representation learning, we used GCN (Kipf & Welling, 2017) encoder, and set the number of GCN layers to 2, the size of hidden dimension for each layer to 512. The training epoch is 1000. For graph representation learning, we adopted GIN (Xu et al., 2019) encoder with 5 layers, which was concatenated by a readout function that adds node representations for pooling. The embedding size was set to 32 for TU dataset and 300 for OGB dataset. We used 100 training epochs with batch size 32. In all the experiments, we used the Adam optimizer with learning rate 0.001 and weight decay 10⁻⁵. For data augmentation, we adopted both edge perturbation and feature masking, whose perturbation ratios ρe and ρf were tuned by grid search among {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9} based on the validation set. |
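The grid search over the augmentation ratios ρe and ρf described in the Experiment Setup row can be sketched as follows. This is a minimal stand-alone illustration, not the authors' code: the `validate` callback is a hypothetical stand-in for training the encoder and scoring it on the validation set, as the paper describes.

```python
# Sketch of the 9x9 grid search over edge-perturbation ratio rho_e and
# feature-masking ratio rho_f reported in the paper's experiment setup.
from itertools import product

# Candidate ratios {0.1, 0.2, ..., 0.9}, as quoted above.
RATIOS = [round(0.1 * i, 1) for i in range(1, 10)]

def grid_search(validate):
    """Return the (rho_e, rho_f) pair maximizing the validation score.

    `validate` is a hypothetical callable (rho_e, rho_f) -> score; in the
    paper it would be validation accuracy of the GCL model trained with
    those augmentation ratios.
    """
    best_pair, best_score = None, float("-inf")
    for rho_e, rho_f in product(RATIOS, RATIOS):
        score = validate(rho_e, rho_f)
        if score > best_score:
            best_pair, best_score = (rho_e, rho_f), score
    return best_pair, best_score

# Toy usage with an artificial scoring surface peaked at (0.3, 0.5):
best, score = grid_search(lambda e, f: -((e - 0.3) ** 2 + (f - 0.5) ** 2))
```

In the actual pipeline each call to `validate` would be a full pre-training run, so the 81 grid points are the dominant tuning cost.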