Graph Structure Extrapolation for Out-of-Distribution Generalization
Authors: Xiner Li, Shurui Gui, Youzhi Luo, Shuiwang Ji
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Experimental Studies |
| Researcher Affiliation | Academia | Department of Computer Science and Engineering, Texas A&M University, Texas, USA. |
| Pseudocode | No | The paper describes procedures in text and uses mathematical formulations but does not provide clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | We adopt 5 datasets from the GOOD benchmark (Gui et al., 2022a), HIV-size, HIV-scaffold, SST2-length, Motif-size, and Motif-base, where "-" denotes the shift domain. We construct another natural language dataset Twitter-length (Yuan et al., 2020) following the OOD split of GOOD. Additionally, we adopt the protein dataset DD-size and the molecular dataset NCI1-size following Bevilacqua et al. (2021). |
| Dataset Splits | Yes | For all experiments, we select the best checkpoints for OOD tests according to results on OOD validation sets; ID validation and ID test are also used for comparison if available. |
| Hardware Specification | Yes | For computation, we generally use one NVIDIA GeForce RTX 2080 Ti for each single experiment. |
| Software Dependencies | No | The paper mentions software components like "GIN-Virtual Node", "GIN", "GraphSAINT", "GCN", and "Adam optimizer" but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | For all the experiments, we use the Adam optimizer, with a weight decay tuned from the set {0, 1e-2, 1e-3, 1e-4} and a dropout rate of 0.5. The number of convolutional layers in GNN models for each dataset is tuned from the set {3, 5}. We use mean global pooling and the ReLU activation function, and the dimension of the hidden layer is 300. We select the maximum number of epochs from {100, 200, 500}, the initial learning rate from {1e-3, 3e-3, 5e-3, 1e-4}, and the batch size from {32, 64, 128} for graph-level and {1024, 4096} for node-level tasks. All models are trained to converge in the training process. |
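
The sketch below illustrates how the quoted experiment setup and the OOD-validation checkpoint selection (see the "Dataset Splits" and "Experiment Setup" rows) could be wired together. It assumes PyTorch Geometric; the `GIN` class, `train_and_select`, and `evaluate` names are hypothetical helpers for illustration, not the authors' released code (none is linked above), and `in_dim=9` is a placeholder feature width. Values such as hidden dimension 300, dropout 0.5, and the {3, 5} layer / {0, 1e-2, 1e-3, 1e-4} weight-decay grids come from the setup quoted in the table.

```python
# Minimal sketch, assuming PyTorch Geometric; illustrative, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GINConv, global_mean_pool


class GIN(nn.Module):
    """GIN graph classifier with the quoted hyperparameters:
    hidden dimension 300, ReLU activations, dropout 0.5, mean global pooling."""

    def __init__(self, in_dim, num_classes, num_layers=3, hidden=300, dropout=0.5):
        super().__init__()
        self.convs = nn.ModuleList()
        dims = [in_dim] + [hidden] * num_layers
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            mlp = nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU(), nn.Linear(d_out, d_out))
            self.convs.append(GINConv(mlp))
        self.dropout = dropout
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x, edge_index, batch):
        for conv in self.convs:
            x = F.relu(conv(x, edge_index))
            x = F.dropout(x, p=self.dropout, training=self.training)
        return self.head(global_mean_pool(x, batch))


@torch.no_grad()
def evaluate(model, loader):
    """Graph classification accuracy; stands in for the benchmark's own metric."""
    model.eval()
    correct = total = 0
    for data in loader:
        pred = model(data.x, data.edge_index, data.batch).argmax(dim=-1)
        correct += int((pred == data.y).sum())
        total += data.y.size(0)
    return correct / total


def train_and_select(model, optimizer, train_loader, ood_val_loader, epochs=100):
    """Train and keep the checkpoint scoring best on the OOD validation split,
    as described in the 'Dataset Splits' row above."""
    best_score, best_state = float("-inf"), None
    for _ in range(epochs):
        model.train()
        for data in train_loader:
            optimizer.zero_grad()
            loss = F.cross_entropy(model(data.x, data.edge_index, data.batch), data.y)
            loss.backward()
            optimizer.step()
        score = evaluate(model, ood_val_loader)
        if score > best_score:
            best_score = score
            best_state = {k: v.detach().clone() for k, v in model.state_dict().items()}
    model.load_state_dict(best_state)
    return model


# num_layers is tuned from {3, 5}; weight decay from {0, 1e-2, 1e-3, 1e-4};
# learning rate from {1e-3, 3e-3, 5e-3, 1e-4}, per the quoted setup.
model = GIN(in_dim=9, num_classes=2, num_layers=3)  # in_dim=9 is a placeholder
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
```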