Mixture of Weak and Strong Experts on Graphs
Authors: Hanqing Zeng, Hanjia Lyu, Diyi Hu, Yinglong Xia, Jiebo Luo
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, Mowst on 4 backbone GNN architectures shows significant accuracy improvement on 6 standard node classification benchmarks, including both homophilous and heterophilous graphs (https://github.com/facebookresearch/mowst-gnn). |
| Researcher Affiliation | Collaboration | Hanqing Zeng (Meta AI, zengh@meta.com); Hanjia Lyu (University of Rochester, hlyu5@ur.rochester.edu); Diyi Hu (University of Southern California, diyihu@usc.edu); Yinglong Xia (Meta AI, yxia@meta.com); Jiebo Luo (University of Rochester, jluo@cs.rochester.edu) |
| Pseudocode | Yes | Algorithm 1: Mowst inference; Algorithm 2: Mowst training (an illustrative inference sketch follows the table). |
| Open Source Code | Yes | Empirically, Mowst on 4 backbone GNN architectures shows significant accuracy improvement on 6 standard node classification benchmarks, including both homophilous and heterophilous graphs (https://github.com/facebookresearch/mowst-gnn). |
| Open Datasets | Yes | We evaluate Mowst(⋆) on a diverse set of benchmarks, including 3 homophilous graphs (Flickr (Zeng et al., 2020), ogbn-arxiv and ogbn-products (Hu et al., 2020)) and 3 heterophilous graphs (Penn94, pokec and twitch-gamer (Lim et al., 2021)). |
| Dataset Splits | Yes | We perform node classification using the accuracy metric, with standard training / validation / test splits. |
| Hardware Specification | Yes | All models, including our models and the baselines, are trained on NVIDIA A100 GPUs with 80GB of memory. |
| Software Dependencies | No | The paper mentions "PyTorch and the PyTorch Geometric library" but does not specify version numbers for these software components. |
| Experiment Setup | Yes | HYPERPARAMETERS. For Flickr, ogbn-products and ogbn-arxiv, we follow the original literature (Hu et al., 2020; Zeng et al., 2021) to set the number of layers as 3 and hidden dimension as 256, for all the baselines as well as for both the MLP and GNN experts of Mowst(⋆). Regarding Penn94, pokec and twitch-gamer, the authors of the original paper (Lim et al., 2021) searched for the best network architecture for each baseline independently. We follow the same protocol and hyperparameter space for our baselines and Mowst(⋆). We set an additional constraint on Mowst(⋆) to ensure fair comparison under similar computation costs: we first follow Lim et al. (2021) to determine the number of layers ℓ and hidden dimension d of the vanilla GNN baselines, and then set the same ℓ and d for the corresponding Mowst models. We use another MLP to implement the learnable G function (Section 2.2). To reduce the size of the hyperparameter space, we set the number of layers and hidden dimension of G's MLP the same as those of the experts. See Appendix A.2 for the hyperparameter space (e.g., learning rate, dropout) and the grid-search methodology. (An illustrative configuration sketch of this constraint follows the table.) |
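
The pseudocode row above refers to Algorithm 1 (Mowst inference) and Algorithm 2 (Mowst training). As a rough illustration of the inference path only, the following is a minimal PyTorch / PyTorch Geometric sketch of mixing a weak MLP expert with a strong GNN expert through a learnable confidence gate G. The class name `MowstSketch`, the GCN backbone choice, the two-layer depths, and the expectation-form mixture are assumptions made for illustration; they are not the authors' released implementation, which should be consulted for the exact gating procedure.

```python
# Hedged sketch of Mowst-style inference (cf. Algorithm 1 in the paper).
# A weak MLP expert and a strong GNN expert are combined via a learnable
# confidence gate G that reads the MLP's predicted class distribution.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv  # GCN used purely as a stand-in backbone


class MowstSketch(nn.Module):
    def __init__(self, in_dim: int, hidden: int, num_classes: int):
        super().__init__()
        # Weak expert: plain MLP on raw node features (no graph structure).
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )
        # Strong expert: a GNN over the graph (depth/width are per-dataset in the paper).
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, num_classes)
        # Learnable gate G: maps the MLP's class distribution to a confidence in [0, 1].
        self.gate = nn.Sequential(
            nn.Linear(num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        p_weak = F.softmax(self.mlp(x), dim=-1)                    # weak-expert prediction
        h = F.relu(self.conv1(x, edge_index))
        p_strong = F.softmax(self.conv2(h, edge_index), dim=-1)    # strong-expert prediction
        q = torch.sigmoid(self.gate(p_weak))                       # per-node confidence in the MLP
        # Expectation-form mixture: defer to the GNN where the MLP is not confident.
        return q * p_weak + (1.0 - q) * p_strong
```

The expectation-form mixture above is one deterministic way to realize the confidence-based gating; the paper's Algorithm 1 defines the actual inference rule, and the Mowst vs. Mowst⋆ variants differ in how the gate is applied.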
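
The experiment-setup row fixes the Mowst experts and G's MLP to the same depth and width as the corresponding vanilla GNN baseline, and hard-codes 3 layers / 256 hidden units for the three homophilous benchmarks. A minimal sketch of that constraint follows, assuming a hypothetical `mowst_config` helper that is not part of the released repository.

```python
# Hedged sketch of the hyperparameter constraint described in the table:
# the MLP expert, the GNN expert, and G's MLP all reuse the depth/width
# chosen for the vanilla GNN baseline on each dataset.
def mowst_config(dataset: str, baseline_layers: int, baseline_hidden: int) -> dict:
    """Return a per-dataset architecture config for Mowst (illustrative only)."""
    if dataset in {"flickr", "ogbn-arxiv", "ogbn-products"}:
        # Fixed per the original literature (Hu et al., 2020; Zeng et al., 2021).
        layers, hidden = 3, 256
    else:
        # Penn94, pokec, twitch-gamer: inherit the tuned vanilla GNN baseline
        # architecture (Lim et al., 2021) for a fair, similar-cost comparison.
        layers, hidden = baseline_layers, baseline_hidden
    return {
        "mlp_expert": {"layers": layers, "hidden": hidden},
        "gnn_expert": {"layers": layers, "hidden": hidden},
        "gate_mlp":   {"layers": layers, "hidden": hidden},  # G's MLP mirrors the experts
    }
```

Remaining hyperparameters (learning rate, dropout, etc.) are grid-searched as described in the paper's Appendix A.2.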