Large-Scale Graph Neural Architecture Search

Authors: Chaoyu Guan, Xin Wang, Hong Chen, Ziwei Zhang, Wenwu Zhu

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically evaluate GAUSS on five datasets whose node sizes range from 10^4 to 10^8. The experimental results demonstrate substantial improvements of GAUSS over other GNAS baselines on all datasets. Detailed ablation studies further validate the design of our proposed method, including the architecture peer learning on graphs and the architecture importance sampling. Table 2 shows the overall comparisons of GAUSS and previous hand-crafted and automated baselines on all 5 datasets.
Researcher Affiliation | Academia | Media and Network Lab, Department of Computer Science and Technology, Tsinghua University. Correspondence to: Wenwu Zhu <wwzhu@tsinghua.edu.cn>, Xin Wang <xin_wang@tsinghua.edu.cn>.
Pseudocode | Yes | Algorithm 1: A Basic Version of Scalable Supernet Training. Algorithm 2: GAUSS Supernet Training. (An illustrative supernet-training sketch is given below the table.)
Open Source Code | Yes | Code is available at https://www.github.com/THUMNLab/GAUSS.
Open Datasets | Yes | We select 5 node classification datasets from OGB (Hu et al., 2020) and GNN Benchmark (Shchur et al., 2018), including graphs of different scales to demonstrate the scalability of our proposed model.
Dataset Splits | Yes | We report both the validation and test accuracy [%] over 10 runs with different seeds. For the architecture peer learning on graphs, we set the size of the team learning group n to 10, and set α_min to 0.6 at the beginning of the search. At each controller optimization cycle, we train the GRU-based controller for 5 epochs and sample 12 architectures every epoch, using a moving average baseline with an update ratio of 0.1. We report the detailed hyper-parameters for supernet training and architecture retraining in Table 6. For Papers100M, we use neighborhood sampling following GraphSAGE (Hamilton et al., 2017), with every layer sampling 12 and 100 neighbors for train and validation/test, respectively. The batch size is set to 1024. (An illustrative neighborhood-sampling sketch is given below the table.)
Hardware Specification | Yes | All the experiments are implemented using PyTorch and are conducted on a Tesla V100 GPU with 32GB of memory. GPU: NVIDIA Tesla V100-PCIE-32GB. CPU: Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz.
Software Dependencies | Yes | Software: Python 3.7.11, PyTorch 1.10.1, PyTorch Geometric 2.0.3 (Fey & Lenssen, 2019), Deep Graph Library 0.7.2 (Wang et al., 2019), Open Graph Benchmark 1.3.2 (Hu et al., 2020). (An environment-check sketch is given below the table.)
Experiment Setup | Yes | Table 6: Detailed hyper-parameter settings. For the architecture peer learning on graphs, we set the size of the team learning group n to 10, and set α_min to 0.6 at the beginning of the search. At each controller optimization cycle, we train the GRU-based controller for 5 epochs and sample 12 architectures every epoch, using a moving average baseline with an update ratio of 0.1. (An illustrative controller-update sketch is given below the table.)
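
The sketches below expand on the table entries; none of them are taken from the paper's code or pseudocode, and every name, dimension, or hyper-parameter not quoted above is an assumption.

Algorithm 1 in the paper is a basic version of scalable supernet training. A minimal single-path weight-sharing loop in PyTorch is sketched here: one architecture is sampled per mini-batch and only the shared weights along that path are updated. The MLP-style SuperLayer, the candidate operations, the uniform sampling, and all sizes are illustrative stand-ins rather than the authors' GNN search space; GAUSS's full procedure (Algorithm 2) additionally uses architecture peer learning and importance sampling.

    import random
    import torch
    import torch.nn as nn

    # Hypothetical candidate operations for one supernet layer (illustrative only).
    CANDIDATES = [
        lambda i, o: nn.Linear(i, o),
        lambda i, o: nn.Sequential(nn.Linear(i, o), nn.ReLU()),
    ]

    class SuperLayer(nn.Module):
        """Weight-sharing layer holding all candidate ops; only the sampled op runs."""
        def __init__(self, in_dim, out_dim):
            super().__init__()
            self.ops = nn.ModuleList([make(in_dim, out_dim) for make in CANDIDATES])

        def forward(self, x, choice):
            return self.ops[choice](x)

    class Supernet(nn.Module):
        def __init__(self, in_dim, hidden_dim, out_dim, num_layers=3):
            super().__init__()
            dims = [in_dim] + [hidden_dim] * (num_layers - 1) + [out_dim]
            self.layers = nn.ModuleList(
                [SuperLayer(dims[i], dims[i + 1]) for i in range(num_layers)])

        def forward(self, x, arch):
            for layer, choice in zip(self.layers, arch):
                x = layer(x, choice)
            return x

    # Basic single-path training: sample one architecture per mini-batch and
    # update only the shared weights along the sampled path.
    net = Supernet(in_dim=100, hidden_dim=64, out_dim=40)
    optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()

    for step in range(100):                        # toy training loop
        x = torch.randn(1024, 100)                 # stand-in mini-batch of features
        y = torch.randint(0, 40, (1024,))          # stand-in labels
        arch = [random.randrange(len(CANDIDATES)) for _ in net.layers]
        loss = criterion(net(x, arch), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()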
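
For the Papers100M setting quoted under Dataset Splits (GraphSAGE-style neighborhood sampling, 12 neighbors per layer for training, 100 for validation/test, batch size 1024), the sketch below uses PyTorch Geometric's NeighborLoader together with OGB's PygNodePropPredDataset. The paper does not state which sampling API it uses, and the three-layer depth is an assumption.

    from ogb.nodeproppred import PygNodePropPredDataset
    from torch_geometric.loader import NeighborLoader

    dataset = PygNodePropPredDataset(name="ogbn-papers100M")
    data = dataset[0]
    split_idx = dataset.get_idx_split()
    num_layers = 3  # illustrative depth; the searched architecture fixes this in practice

    # Training loader: 12 sampled neighbors per layer, mini-batches of 1024 target nodes.
    train_loader = NeighborLoader(
        data,
        num_neighbors=[12] * num_layers,
        input_nodes=split_idx["train"],
        batch_size=1024,
        shuffle=True,
    )

    # Validation/test loaders: 100 sampled neighbors per layer.
    val_loader = NeighborLoader(
        data,
        num_neighbors=[100] * num_layers,
        input_nodes=split_idx["valid"],
        batch_size=1024,
    )
    test_loader = NeighborLoader(
        data,
        num_neighbors=[100] * num_layers,
        input_nodes=split_idx["test"],
        batch_size=1024,
    )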
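
A quick way to confirm the pinned software versions is to compare the installed packages against the list under Software Dependencies. The check script itself is not from the paper; it only reads the standard version attributes exposed by these libraries.

    import sys
    import torch
    import torch_geometric
    import dgl
    import ogb

    # Versions reported in the paper's appendix.
    expected = {
        "python": "3.7.11",
        "torch": "1.10.1",
        "torch_geometric": "2.0.3",
        "dgl": "0.7.2",
        "ogb": "1.3.2",
    }
    found = {
        "python": sys.version.split()[0],
        "torch": torch.__version__,
        "torch_geometric": torch_geometric.__version__,
        "dgl": dgl.__version__,
        "ogb": ogb.__version__,
    }
    for name, want in expected.items():
        status = "OK" if found[name].startswith(want) else "MISMATCH"
        print(f"{name}: found {found[name]}, expected {want} -> {status}")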
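
The controller settings under Experiment Setup (a GRU-based controller trained for 5 epochs per optimization cycle, 12 sampled architectures per epoch, and a moving-average baseline with update ratio 0.1) follow the usual REINFORCE recipe sketched below. The controller class, hidden size, search-space sizes, learning rate, and the placeholder reward function are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class GRUController(nn.Module):
        """Minimal GRU controller emitting one categorical decision per search step."""
        def __init__(self, num_choices_per_step, hidden_dim=64):
            super().__init__()
            self.embed = nn.Embedding(max(num_choices_per_step) + 1, hidden_dim)
            self.cell = nn.GRUCell(hidden_dim, hidden_dim)
            self.heads = nn.ModuleList(
                [nn.Linear(hidden_dim, c) for c in num_choices_per_step])

        def sample(self):
            h = torch.zeros(1, self.embed.embedding_dim)
            token = torch.zeros(1, dtype=torch.long)      # start token
            arch, log_prob = [], torch.zeros(())
            for head in self.heads:
                h = self.cell(self.embed(token), h)
                dist = torch.distributions.Categorical(logits=head(h))
                choice = dist.sample()
                log_prob = log_prob + dist.log_prob(choice).squeeze()
                arch.append(int(choice))
                token = choice
            return arch, log_prob

    def evaluate_on_validation(arch):
        """Placeholder reward: in practice this would be the sampled architecture's
        validation accuracy obtained from the trained supernet."""
        return float(torch.rand(()))

    controller = GRUController(num_choices_per_step=[4, 4, 3])  # illustrative search space
    optimizer = torch.optim.Adam(controller.parameters(), lr=1e-3)
    baseline = None

    for epoch in range(5):                   # 5 controller epochs per optimization cycle
        for _ in range(12):                  # 12 sampled architectures per epoch
            arch, log_prob = controller.sample()
            reward = evaluate_on_validation(arch)
            # Moving-average baseline with update ratio 0.1.
            baseline = reward if baseline is None else 0.9 * baseline + 0.1 * reward
            loss = -(reward - baseline) * log_prob   # REINFORCE policy-gradient loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

Subtracting the moving-average baseline from the reward reduces the variance of the policy gradient without changing its expectation, which is why such a baseline is standard in controller-based NAS.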