Deep and Flexible Graph Neural Architecture Search

Authors: Wentao Zhang, Zheyu Lin, Yu Shen, Yang Li, Zhi Yang, Bin Cui

ICML 2022

Reproducibility assessment — each entry gives the variable, the result, and the LLM response (supporting excerpts from the paper):

Research Type: Experimental
LLM Response: "Empirical studies on four node classification tasks demonstrate that DFG-NAS outperforms state-of-the-art manual designs and NAS methods of GNNs." (Abstract) ... Section 5, Experiments and Results.

Researcher Affiliation: Academia
LLM Response: "(1) School of CS & Key Laboratory of High Confidence Software Technologies, Peking University; (2) Institute of Computational Social Science, Peking University (Qingdao), China."

Pseudocode: Yes
LLM Response: "Algorithm 1: Searching method."

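Algorithm 1 itself is not reproduced in this record, but combining it with the search budgets quoted under Experiment Setup below (population size k = 20, maximum generations T = 500) gives a minimal sketch of a mutation-based evolutionary loop. Everything not quoted from the paper is an assumption: the P/T (propagation/transformation) pipeline encoding follows the paper's search-space description, while `random_arch`, `mutate`, the tournament size, and the `evaluate` placeholder are illustrative, not the authors' implementation.

```python
import random

OPS = ['P', 'T']  # propagation / transformation ops, per the paper's search space

def random_arch(max_len=8):
    """Sample a random P/T pipeline (the length range is an assumption)."""
    return [random.choice(OPS) for _ in range(random.randint(2, max_len))]

def mutate(arch, max_len=16):
    """Illustrative mutation: insert, delete, or flip one operation."""
    arch = list(arch)
    r = random.random()
    if r < 1 / 3 and len(arch) < max_len:    # insert an op
        arch.insert(random.randrange(len(arch) + 1), random.choice(OPS))
    elif r < 2 / 3 and len(arch) > 2:        # delete an op
        arch.pop(random.randrange(len(arch)))
    else:                                    # flip P <-> T
        i = random.randrange(len(arch))
        arch[i] = 'P' if arch[i] == 'T' else 'T'
    return arch

def evolutionary_search(evaluate, k=20, T=500, sample_size=5):
    """Mutation-based evolution with population size k and T generations,
    matching the budgets reported in the paper. `evaluate` maps an
    architecture to a validation score (hypothetical placeholder)."""
    population = [random_arch() for _ in range(k)]
    scores = [evaluate(a) for a in population]
    best = max(zip(scores, population))
    for _ in range(T):
        # Tournament selection: mutate the fittest of a random sample.
        idx = random.sample(range(k), sample_size)
        parent = population[max(idx, key=lambda i: scores[i])]
        child = mutate(parent)
        s = evaluate(child)
        # Aging evolution: the oldest member leaves the population.
        population.pop(0); scores.pop(0)
        population.append(child); scores.append(s)
        best = max(best, (s, child))
    return best[1], best[0]
```

With the reported budgets, this loop evaluates k + T = 20 + 500 = 520 architectures in total.
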
Open Source Code: Yes
LLM Response: "Our code is available in the anonymized repository https://github.com/PKU-DAIR/DFG-NAS."

Open Datasets: Yes
LLM Response: "We conduct the experiments on four public graph datasets: three citation graphs (Cora, Citeseer and PubMed) (Kipf & Welling, 2017), and one large OGB graph (ogbn-arxiv) (Hu et al., 2020). ... Cora, Citeseer, and PubMed [1] are three popular citation network datasets, and we follow the public training/validation/test split in GCN (Kipf & Welling, 2017). ... The public version provided by OGB [2] is used in our paper." Footnotes: [1] https://github.com/tkipf/gcn/tree/master/gcn/data; [2] https://ogb.stanford.edu/docs/nodeprop/#ogbn-arxiv

Dataset Splits: Yes
LLM Response: "the node set V is partitioned into training set V_train (including both the labeled set V_l and unlabeled set V_u), validation set V_val and test set V_test. ... We follow the public training/validation/test split for three citation networks and adopt the official split in the OGB graph." See also Table 6, Overview of the Four Datasets (#Train/Val/Test).

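The paper does not say which loading library it uses; assuming PyTorch Geometric for the three citation graphs and the official ogb package for ogbn-arxiv (both ship exactly these public splits), the datasets and splits quoted above could be loaded as follows. The root='data' path and the printout are our choices:

```python
from torch_geometric.datasets import Planetoid
from ogb.nodeproppred import PygNodePropPredDataset

# Cora/CiteSeer/PubMed with the public Planetoid split used by GCN.
for name in ['Cora', 'CiteSeer', 'PubMed']:
    data = Planetoid(root='data', name=name)[0]
    print(name, int(data.train_mask.sum()), int(data.val_mask.sum()),
          int(data.test_mask.sum()))

# ogbn-arxiv with the official split provided by OGB.
arxiv = PygNodePropPredDataset(name='ogbn-arxiv', root='data')
split = arxiv.get_idx_split()  # dict with 'train', 'valid', 'test' indices
train_idx, valid_idx, test_idx = split['train'], split['valid'], split['test']
```
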
Hardware Specification: Yes
LLM Response: "The experiments are conducted on a machine with Intel(R) Xeon(R) Gold 5120 CPU @ 2.20GHz, and a single NVIDIA TITAN RTX GPU with 24GB GPU memory."

Software Dependencies: Yes
LLM Response: "For software versions, we use Python 3.6, PyTorch 1.7.1, and CUDA 10.1."

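A reproduction can sanity-check its environment against the reported stack; the expected values in the comments are taken from the two entries above:

```python
import sys
import torch

print(sys.version.split()[0])             # expected: 3.6.x
print(torch.__version__)                  # expected: 1.7.1
print(torch.version.cuda)                 # expected: 10.1
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # expected: a single TITAN RTX (24GB)
```
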
Experiment Setup: Yes
LLM Response: "For the architecture search, the number of the population set k and the maximum generation times T in Algorithm 1 are 20 and 500 for all datasets. For the training of GNN architectures, we follow the same hyper-parameters as in their original papers and tune them with OpenBox (Li et al., 2021a). The training budget of each searched GNN architecture in DFG-NAS is 200 epochs for the three citation networks and 500 epochs for the ogbn-arxiv dataset. Specifically, we re-train each searched architecture ten times to avoid randomness. These baselines are implemented based on their open-sourced versions. Since AutoGNN is not publicly available, we only report its performance on citation graphs following its paper. The details of hyperparameters and reproduction instructions are provided in Appendices A.2 and A.5. ... Specifically, we train them using the Adam optimizer with a learning rate of 0.02 for Cora, 0.03 for Citeseer, 0.1 for PubMed, and 0.001 for ogbn-arxiv. The regularization factor is 5e-4 for all datasets. We apply dropout to all feature vectors with rates of 0.5 for Cora and Citeseer, and 0.3 for PubMed and ogbn-arxiv. Besides, the dropout between different GNN layers is 0.8 for Cora and Citeseer, and 0.5 for PubMed and ogbn-arxiv. At last, the hidden size of each GNN layer is 128 for Cora and ogbn-arxiv, 256 for Citeseer, and 512 for PubMed."

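The training hyperparameters quoted above can be collected into one per-dataset config. This is a sketch under two stated assumptions: the "regularization factor" is applied as Adam weight decay, and the duplicated "ogbn-arxiv" in the paper's hidden-size list is read as PubMed:

```python
import torch

# Hyperparameters as quoted in the Experiment Setup entry. hidden=512 for
# PubMed is our reading of the paper's text, which lists ogbn-arxiv twice;
# mapping the regularization factor to Adam's weight_decay is an assumption.
HPARAMS = {
    'cora':       dict(lr=0.02,  reg=5e-4, feat_drop=0.5, layer_drop=0.8, hidden=128, epochs=200),
    'citeseer':   dict(lr=0.03,  reg=5e-4, feat_drop=0.5, layer_drop=0.8, hidden=256, epochs=200),
    'pubmed':     dict(lr=0.10,  reg=5e-4, feat_drop=0.3, layer_drop=0.5, hidden=512, epochs=200),
    'ogbn-arxiv': dict(lr=0.001, reg=5e-4, feat_drop=0.3, layer_drop=0.5, hidden=128, epochs=500),
}

def make_optimizer(model: torch.nn.Module, dataset: str) -> torch.optim.Adam:
    """Adam optimizer with the dataset-specific learning rate and
    regularization factor (assumed to be weight decay)."""
    hp = HPARAMS[dataset]
    return torch.optim.Adam(model.parameters(), lr=hp['lr'],
                            weight_decay=hp['reg'])
```

Since the paper re-trains each searched architecture ten times to average out randomness, a reproduction would additionally loop training over ten seeds with the per-dataset epoch budgets above.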