Graph Differentiable Architecture Search with Structure Learning

Authors: Yijian Qin, Xin Wang, Zeyang Zhang, Wenwu Zhu

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct theoretical analysis and a measurement study with experiments to discover that gradient-based NAS methods tend to select architectures based on the usefulness of different types of information with respect to the target task. Our explorations further show that gradient-based NAS also suffers from noise hidden in the graph, resulting in suboptimal GNN architectures. Based on these findings, we propose a Graph differentiable Architecture Search model with Structure Optimization (GASSO), which allows differentiable search of the architecture with gradient descent and discovers graph neural architectures with better performance by employing graph structure learning as a denoising process in the search procedure. Extensive experiments on real-world graph datasets demonstrate that GASSO achieves state-of-the-art performance compared with existing baselines. (A mixed-operator sketch of this differentiable search follows the table.)
Researcher Affiliation | Academia | Yijian Qin, Xin Wang, Zeyang Zhang, Wenwu Zhu (Tsinghua University); qinyj19@mails.tsinghua.edu.cn, xin_wang@tsinghua.edu.cn, zy-zhang20@mails.tsinghua.edu.cn, wwzhu@tsinghua.edu.cn
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code will be released at https://github.com/THUMNLab/AutoGL
Open Datasets | Yes | We evaluate our model on three widely used citation benchmark datasets, Cora, Citeseer, and Pubmed, where nodes denote papers and edges denote citation relationships. We conduct further experiments on three larger graph benchmarks: Physics, CoraFull, and ogbn-arxiv. (A loading sketch follows the table.)
Dataset Splits | Yes | For Physics and CoraFull, we randomly split the train/valid/test sets as 50:25:25. For ogbn-arxiv, we follow the default setting. (A split sketch follows the table.)
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions methods such as GCN and GAT but does not provide version numbers for any key software components or libraries (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | Hyper-parameter settings are listed in Appendix B.2. For the GASSO model, the number of layers is set to 2 on Cora and CiteSeer and 4 on PubMed. The candidate operations are GCN [1], GAT [2], GIN [3], MRConv [37], and linear. In the supernetwork, dropout (p=0.8) is applied before each layer and a ReLU after each layer. (A layer sketch follows the table.)
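
The gradient-based search described under Research Type follows the differentiable-NAS recipe: each layer holds a softmax-weighted mixture of candidate GNN operators whose mixing weights are optimized by gradient descent alongside the model weights. Below is a minimal sketch of such a mixed operator, assuming PyTorch Geometric is installed; the class name MixedGNNOp and the reduced candidate set (GCN, GAT, linear) are illustrative assumptions, not taken from the paper's released code.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torch_geometric.nn import GCNConv, GATConv

    class MixedGNNOp(nn.Module):
        # Softmax-weighted mixture of candidate message-passing operators:
        # a continuous relaxation of the discrete architecture choice.
        def __init__(self, in_dim, out_dim):
            super().__init__()
            self.ops = nn.ModuleList([
                GCNConv(in_dim, out_dim),    # graph-convolution candidate
                GATConv(in_dim, out_dim),    # attention candidate
                nn.Linear(in_dim, out_dim),  # structure-free candidate
            ])
            # Architecture parameters: one logit per candidate operator,
            # trained jointly with the model weights by gradient descent.
            self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

        def forward(self, x, edge_index):
            weights = F.softmax(self.alpha, dim=0)
            out = 0
            for w, op in zip(weights, self.ops):
                y = op(x) if isinstance(op, nn.Linear) else op(x, edge_index)
                out = out + w * y
            return out

After search, the candidate with the largest alpha at each layer would be retained; per the abstract, GASSO additionally learns the graph structure itself during search as a denoising step.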
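
All six benchmarks listed under Open Datasets are obtainable from standard packages. A hedged loading sketch, assuming torch_geometric and ogb are installed and 'data' is an arbitrary root directory:

    from torch_geometric.datasets import Planetoid, Coauthor, CoraFull
    from ogb.nodeproppred import PygNodePropPredDataset

    # Citation benchmarks (nodes are papers, edges are citations).
    cora = Planetoid(root='data', name='Cora')[0]
    citeseer = Planetoid(root='data', name='CiteSeer')[0]
    pubmed = Planetoid(root='data', name='PubMed')[0]

    # Larger benchmarks.
    physics = Coauthor(root='data', name='Physics')[0]
    corafull = CoraFull(root='data')[0]
    arxiv = PygNodePropPredDataset(name='ogbn-arxiv', root='data')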
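
The 50:25:25 random node split reported for Physics and CoraFull can be reproduced along these lines (the helper name and seed are illustrative assumptions); ogbn-arxiv instead ships its own default split:

    import torch

    def random_split_masks(num_nodes, train_frac=0.5, val_frac=0.25, seed=0):
        # Shuffle node indices, then take 50% train, 25% valid, 25% test.
        gen = torch.Generator().manual_seed(seed)
        perm = torch.randperm(num_nodes, generator=gen)
        n_train = int(train_frac * num_nodes)
        n_val = int(val_frac * num_nodes)
        train_mask = torch.zeros(num_nodes, dtype=torch.bool)
        val_mask = torch.zeros(num_nodes, dtype=torch.bool)
        test_mask = torch.zeros(num_nodes, dtype=torch.bool)
        train_mask[perm[:n_train]] = True
        val_mask[perm[n_train:n_train + n_val]] = True
        test_mask[perm[n_train + n_val:]] = True
        return train_mask, val_mask, test_mask

    # ogbn-arxiv default split (OGB API):
    # split_idx = arxiv.get_idx_split()  # keys: 'train', 'valid', 'test'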
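
Combining the settings under Experiment Setup, one supernet layer applies dropout (p=0.8), then the mixed operator, then ReLU. A minimal sketch reusing the hypothetical MixedGNNOp above (that sketch omits the GIN and MRConv candidates for brevity, whereas the paper includes them):

    import torch.nn as nn
    import torch.nn.functional as F

    class SupernetLayer(nn.Module):
        def __init__(self, in_dim, out_dim, p=0.8):
            super().__init__()
            self.dropout = nn.Dropout(p)  # dropout before each layer (p=0.8)
            self.mixed = MixedGNNOp(in_dim, out_dim)

        def forward(self, x, edge_index):
            x = self.dropout(x)
            x = self.mixed(x, edge_index)
            return F.relu(x)              # ReLU after each layer

    # Per the paper: stack 2 such layers for Cora/CiteSeer, 4 for PubMed.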