Graph Neural Architecture Search

Authors: Yang Gao, Hong Yang, Peng Zhang, Chuan Zhou, Yue Hu

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on real-world datasets demonstrate that GraphNAS can design a novel network architecture that rivals the best human-invented architectures in terms of validation set accuracy. Moreover, in a transfer learning task we observe that graph neural architectures designed by GraphNAS, when transferred to new datasets, still gain improvements in prediction accuracy.
Researcher Affiliation | Collaboration | Yang Gao (1,5), Hong Yang (2), Peng Zhang (3), Chuan Zhou (4,5) and Yue Hu (1,5). (1) Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; (2) Centre for Artificial Intelligence, University of Technology Sydney, Australia; (3) Ant Financial Services Group, Hangzhou, China; (4) Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China; (5) School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Pseudocode | Yes | Algorithm 1: GraphNAS search algorithm
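The paper's Algorithm 1 is a reinforcement-learning search loop: a controller samples a candidate architecture, a child GNN is trained to produce a reward (validation accuracy), and the controller is updated by policy gradient. The toy sketch below illustrates that loop with independent per-slot softmax logits and a moving-average baseline; the actual controller in the paper is an LSTM, and `evaluate` stands in for training a child GNN, so this is an illustrative approximation rather than the authors' algorithm.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def graphnas_search(evaluate, choices_per_slot, steps=100, lr=0.1, seed=0):
    """Toy REINFORCE-style sketch of the search loop in Algorithm 1.

    evaluate: callable mapping a sampled architecture (list of choice
    indices, one per slot) to a reward, e.g. validation accuracy.
    choices_per_slot: number of options available at each architecture slot.
    """
    rng = np.random.default_rng(seed)
    logits = [np.zeros(n) for n in choices_per_slot]
    baseline = 0.0
    best, best_reward = None, -np.inf
    for _ in range(steps):
        # Sample one architecture: one choice index per slot.
        arch = [rng.choice(n, p=softmax(l)) for n, l in zip(choices_per_slot, logits)]
        reward = evaluate(arch)
        baseline = 0.9 * baseline + 0.1 * reward  # moving-average baseline
        # Policy-gradient update toward high-reward choices.
        for slot, action in enumerate(arch):
            p = softmax(logits[slot])
            grad = -p
            grad[action] += 1.0  # gradient of log pi(action | slot)
            logits[slot] += lr * (reward - baseline) * grad
        if reward > best_reward:
            best, best_reward = arch, reward
    return best, best_reward
```

With a cheap surrogate `evaluate`, the loop concentrates probability mass on high-reward choices; in the paper, each evaluation instead costs a full child-model training run, which is why the search budget (S = 2000 architectures) matters.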
Open Source Code | Yes | We have released the Python code on GitHub for comparison: https://github.com/GraphNAS/GraphNAS
Open Datasets | Yes | Datasets. We use three popular citation networks, i.e., Cora, Citeseer and Pubmed, as the testbed. To test the capability of transferring the architectures designed by GraphNAS, we use the co-author datasets MS-CS and MS-Physics, and the product networks Amazon Computers and Amazon Photos [Shchur et al., 2018].
Dataset Splits | Yes | In the semi-supervised learning task, the datasets follow the settings of [Kipf and Welling, 2017]: only 20 labels per class are used for training on each citation network, with 500 nodes for validation and 1,000 nodes for testing. In the supervised learning task, each split uses 500 nodes for validation, 500 nodes for testing, and the remaining nodes for training.
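The semi-supervised split above (20 training labels per class, 500 validation nodes, 1,000 test nodes) can be sketched as follows. The exact node selection in the paper follows the fixed splits of Kipf and Welling (2017); this random version is an illustrative approximation, not the authors' code.

```python
import numpy as np

def semi_supervised_split(labels, num_classes, seed=0):
    """Approximate the semi-supervised split: 20 labeled nodes per class
    for training, then 500 validation and 1,000 test nodes from the rest."""
    rng = np.random.default_rng(seed)
    train_idx = []
    for c in range(num_classes):
        class_idx = np.flatnonzero(labels == c)
        train_idx.extend(rng.choice(class_idx, size=20, replace=False))
    train_idx = np.array(train_idx)
    # Remaining nodes supply the validation and test sets.
    rest = rng.permutation(np.setdiff1d(np.arange(len(labels)), train_idx))
    val_idx, test_idx = rest[:500], rest[500:1500]
    return train_idx, val_idx, test_idx
```

On a Cora-sized graph (2,708 nodes, 7 classes) this yields 140 training, 500 validation, and 1,000 test nodes, mirroring the counts quoted above.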
Hardware Specification | Yes | The experiments are run on a single NVIDIA 1080Ti.
Software Dependencies | No | The GNN architectures used in GraphNAS are implemented with PyG [Fey and Lenssen, 2019]. While PyG is mentioned, specific version numbers for PyG or other key software dependencies (e.g., Python, PyTorch) are not provided, which is necessary for reproducibility.
Experiment Setup | Yes | Hyper-parameters of the controller: the controller is a one-layer LSTM with 100 hidden units, trained with the ADAM optimizer at a learning rate of 0.00035. The weights of the controller are initialized uniformly between -0.1 and 0.1. To prevent premature convergence, a tanh constant of 2.5 and a temperature of 5.0 are applied to the sampling logits [Bello et al., 2017], and the controller's sample entropy is added to the reward, weighted by 0.0001. After GraphNAS searches S = 2000 architectures, the top K = 5 architectures with the best validation accuracy are collected; each is then trained N = 20 times to choose the best models. Each GNN designed by GraphNAS contains L = 2 layers for fair comparison. Hyper-parameters of the GNNs: once the controller samples an architecture, a child model is constructed and trained for 300 epochs, with L2 regularization λ = 0.0005, dropout probability p = 0.6, and learning rate lr = 0.005 as defaults. To achieve the best results, the GNN hyper-parameters are searched over the following space:
Hidden size: [8, 16, 32, 64, 128, 256, 512]
Learning rate: [1e-2, 1e-3, 1e-4, 5e-3, 5e-4]
Dropout: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
L2 regularization strength: [0, 1e-3, 1e-4, 1e-5, 5e-5, 5e-4]
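The GNN hyper-parameter grid quoted above can be written down directly. Sampling uniformly from it, as below, is a simple stand-in for however the authors actually tuned the top architectures (the paper does not state whether the search is exhaustive or random), so treat this as a sketch of the search space rather than their tuning procedure.

```python
import random

# Hyper-parameter search space for the child GNNs, as quoted in the paper.
SEARCH_SPACE = {
    "hidden_size": [8, 16, 32, 64, 128, 256, 512],
    "learning_rate": [1e-2, 1e-3, 1e-4, 5e-3, 5e-4],
    "dropout": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    "weight_decay": [0, 1e-3, 1e-4, 1e-5, 5e-5, 5e-4],  # L2 strength
}

def sample_config(rng=random):
    """Draw one hyper-parameter configuration uniformly from the grid.
    The paper's defaults (lr = 0.005, dropout = 0.6, L2 = 0.0005) are
    all members of this grid."""
    return {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
```

The grid contains 7 × 5 × 10 × 6 = 2,100 configurations, so exhaustive search per architecture would be far more expensive than the architecture search itself; random or partial search over this grid is the practical reading.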