Large-Scale Graph Neural Architecture Search
Authors: Chaoyu Guan, Xin Wang, Hong Chen, Ziwei Zhang, Wenwu Zhu
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate GAUSS on five datasets whose node sizes range from 10^4 to 10^8. The experimental results demonstrate substantial improvements of GAUSS over other GNAS baselines on all datasets. Detailed ablation studies further validate the design of our proposed method, including the architecture peer learning on graphs and the architecture importance sampling. Table 2 shows the overall comparisons of GAUSS and previous hand-crafted and automated baselines on all 5 datasets. |
| Researcher Affiliation | Academia | Media and Network Lab, Department of Computer Science and Technology, Tsinghua University. Correspondence to: Wenwu Zhu <wwzhu@tsinghua.edu.cn>, Xin Wang <xin_wang@tsinghua.edu.cn>. |
| Pseudocode | Yes | Algorithm 1: A Basic Version of Scalable Supernet Training. Algorithm 2: GAUSS Supernet Training. (A generic supernet-training sketch is given after the table.) |
| Open Source Code | Yes | Code is available at https://www.github.com/THUMNLab/GAUSS. |
| Open Datasets | Yes | We select 5 node classification datasets from OGB (Hu et al., 2020) and GNN Benchmark (Shchur et al., 2018), including graphs of different scales to demonstrate the scalability of our proposed model. |
| Dataset Splits | Yes | We report both the validation and test accuracy [%] over 10 runs with different seeds. For the architecture peer learning on graphs, we set the size of the team learning group n to 10, and set α_min to 0.6 at the beginning of the search. At each controller optimization cycle, we train the GRU-based controller for 5 epochs and sample 12 architectures every epoch, using a moving average baseline with an update ratio of 0.1. We report the detailed hyper-parameters for supernet training and architecture retraining in Table 6. For Papers100M, we use neighborhood sampling following GraphSAGE (Hamilton et al., 2017), with every layer sampling 12 and 100 neighbors for train and validation/test, respectively. The batch size is set to 1024. (A sampling-configuration sketch is given after the table.) |
| Hardware Specification | Yes | All the experiments are implemented using PyTorch and are conducted on a Tesla V100 GPU with 32GB of memory. GPU: NVIDIA Tesla V100-PCIE-32GB. CPU: Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz. |
| Software Dependencies | Yes | Software: Python 3.7.11, PyTorch 1.10.1, PyTorch Geometric 2.0.3 (Fey & Lenssen, 2019), Deep Graph Library 0.7.2 (Wang et al., 2019), Open Graph Benchmark 1.3.2 (Hu et al., 2020). |
| Experiment Setup | Yes | Table 6: Detailed hyper-parameter settings. For the architecture peer learning on graphs, we set the size of the team learning group n to 10, and set α_min to 0.6 at the beginning of the search. At each controller optimization cycle, we train the GRU-based controller for 5 epochs and sample 12 architectures every epoch, using a moving average baseline with an update ratio of 0.1. (A controller-update sketch is given after the table.) |
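
The Pseudocode row above refers to Algorithm 1 (a basic version of scalable supernet training) and Algorithm 2 (GAUSS supernet training). As a hedged illustration of what a basic weight-sharing supernet training loop looks like, the sketch below samples one candidate architecture per mini-batch and updates the shared weights along that path. It is not the authors' exact algorithm (GAUSS adds architecture peer learning and importance sampling on top of such a loop), and the `supernet`, `search_space`, and batch attributes are assumptions.

```python
import random
import torch.nn.functional as F

def train_supernet_one_epoch(supernet, loader, optimizer, search_space):
    """Generic single-path, weight-sharing supernet training (a sketch, not
    the exact GAUSS Algorithm 1/2): for each mini-batch, sample one candidate
    architecture and update only the shared weights along that path."""
    supernet.train()
    for batch in loader:
        # Hypothetical search space: one list of candidate operations per layer.
        arch = [random.choice(ops) for ops in search_space]
        optimizer.zero_grad()
        out = supernet(batch.x, batch.edge_index, arch)  # forward through the sampled path
        loss = F.cross_entropy(out[batch.train_mask], batch.y[batch.train_mask])
        loss.backward()
        optimizer.step()
```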
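The Dataset Splits row quotes GraphSAGE-style neighborhood sampling for Papers100M, with per-layer fan-outs of 12 (train) and 100 (validation/test) and a batch size of 1024. A minimal sketch of such a sampling configuration is shown below, assuming PyTorch Geometric's `NeighborLoader` and a 2-layer depth; the actual loader (the paper also lists Deep Graph Library) and the network depth, which is part of the searched architecture, are assumptions.

```python
from ogb.nodeproppred import PygNodePropPredDataset
from torch_geometric.loader import NeighborLoader

dataset = PygNodePropPredDataset(name="ogbn-papers100M")
data = dataset[0]
split_idx = dataset.get_idx_split()

# Training loader: sample 12 neighbors per layer (assuming a 2-layer GNN here).
train_loader = NeighborLoader(
    data,
    num_neighbors=[12, 12],
    batch_size=1024,
    input_nodes=split_idx["train"],
    shuffle=True,
)

# Validation/test loaders: larger fan-out of 100 neighbors per layer.
val_loader = NeighborLoader(
    data,
    num_neighbors=[100, 100],
    batch_size=1024,
    input_nodes=split_idx["valid"],
)
```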
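The Experiment Setup row describes the controller schedule: 5 epochs per optimization cycle, 12 sampled architectures per epoch, and a moving-average baseline with update ratio 0.1. This matches a standard REINFORCE-style controller update with a baseline; the sketch below assumes hypothetical `controller.sample()` and `evaluate()` interfaces rather than the released GAUSS API.

```python
import torch

def controller_cycle(controller, optimizer, evaluate,
                     epochs=5, samples_per_epoch=12, baseline_decay=0.1):
    """One controller optimization cycle with a moving-average reward baseline.
    `controller.sample()` (returns an architecture and its log-probability) and
    `evaluate(arch)` (returns a scalar validation reward) are assumed helpers,
    not the released GAUSS API."""
    baseline = None
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = torch.tensor(0.0)
        for _ in range(samples_per_epoch):
            arch, log_prob = controller.sample()  # GRU-based controller
            reward = evaluate(arch)               # e.g. validation accuracy
            if baseline is None:
                baseline = reward
            loss = loss - (reward - baseline) * log_prob            # REINFORCE with baseline
            baseline = baseline + baseline_decay * (reward - baseline)  # moving average
        (loss / samples_per_epoch).backward()
        optimizer.step()
```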