GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts

Authors: Shirley Wu, Kaidi Cao, Bruno Ribeiro, James Y. Zou, Jure Leskovec

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform systematic experiments on both real-world (Section 4.1) and synthetic datasets (Section 4.2) to validate the generalizability of GraphMETRO under complex distribution shifts.
Researcher Affiliation | Academia | Shirley Wu (Stanford University, shirwu@cs.stanford.edu); Kaidi Cao (Stanford University, kaidicao@cs.stanford.edu); Bruno Ribeiro (Purdue University, ribeirob@purdue.edu); James Zou (Stanford University, jamesz@cs.stanford.edu); Jure Leskovec (Stanford University, jure@cs.stanford.edu)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code and data are available at https://github.com/Wuyxin/GraphMETRO.
Open Datasets | Yes | Datasets. We use four classification datasets, i.e., WebKB [51], Twitch [55], Twitter [78], and Graph-SST2 [78, 58], using the dataset splits from the GOOD benchmark [20], which exhibit various real-world covariate shifts.
Dataset Splits | Yes | We randomly split each dataset into training (80%), validation (20%), and testing (20%) subsets.
Hardware Specification | No | The paper states "We have provided the information on our GPUs used for training" in the checklist, but does not specify particular GPU models, quantities, or other hardware details within the paper's content.
Software Dependencies | No | The paper mentions PyG but does not specify its version or the versions of other software dependencies such as PyTorch.
Experiment Setup | Yes | We summarize the model architecture and hyperparameters for synthetic experiments (Section 4.2) in Table 2. We use the Adam optimizer with weight decay set to 0. The encoder (backbone) architecture, including the number of layers and hidden dimensions, is selected based on validation performance from the ERM model and then fixed for each encoder during GraphMETRO training. For all datasets, we conduct a grid search for GraphMETRO learning rates due to the difference in architecture compared to traditional GNN models.
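The experiment setup described above (Adam with weight decay 0, plus a grid search over learning rates selected by validation performance) can be sketched as follows. This is a minimal illustration only, not the authors' code: the `train_and_validate` function, the candidate learning-rate grid, and the linear stand-in for the GNN encoder are all hypothetical placeholders.

```python
import torch
import torch.nn as nn

def train_and_validate(lr, epochs=5):
    """Hypothetical stand-in for one GraphMETRO training run at a given lr."""
    torch.manual_seed(0)
    model = nn.Linear(8, 2)  # placeholder for the GNN encoder + classifier
    # Adam with weight decay set to 0, as stated in the paper's setup
    opt = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=0)
    x = torch.randn(64, 8)                 # toy "validation" features
    y = torch.randint(0, 2, (64,))         # toy binary labels
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        opt.step()
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

# Assumed candidate grid; the paper does not list the searched values.
grid = [1e-2, 1e-3, 1e-4]
best_lr = max(grid, key=train_and_validate)
```

Selecting the architecture once from the ERM baseline and then fixing it, as the paper describes, keeps the grid search one-dimensional over the learning rate.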