VCR-Graphormer: A Mini-batch Graph Transformer via Virtual Connections
Authors: Dongqi Fu, Zhigang Hua, Yan Xie, Jin Fang, Si Zhang, Kaan Sancak, Hao Wu, Andrey Malevich, Jingrui He, Bo Long
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 EXPERIMENTS |
| Researcher Affiliation | Collaboration | University of Illinois Urbana-Champaign, Meta AI, Georgia Institute of Technology. {dongqif2, jingrui}@illinois.edu; {kaan}@gatech.edu; {zhua, yanxie, fangjin, sizhang, haowu1, amalevich, bolong}@meta.com |
| Pseudocode | No | The paper describes its methods through text and mathematical equations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is provided: https://github.com/DongqiFu/VCR-Graphormer |
| Open Datasets | Yes | Totally, we have 13 publicly available graph datasets included in this paper. [...] Small datasets like PubMed, CoraFull, Computer, Photo, CS, and Physics can be accessed from the DGL library. Reddit, Aminer, and Amazon2M are from (Feng et al., 2022) and can be accessed by this link. [...] Squirrel, Actor, and Texas, where the connected neighbors mostly do not share the same label (i.e., measured by node heterophily in (Zhang et al., 2023)). They can also be accessed from the DGL library. Third, we also include the large-scale heterophilous graph dataset, arXiv-Year, from the benchmark (Lim et al., 2021). |
| Dataset Splits | Yes | For small-scale datasets and heterophilous datasets, we apply 60%/20%/20% train/val/test random splits. For large-scale datasets, we follow the random splits from (Feng et al., 2022; Chen et al., 2023). |
| Hardware Specification | Yes | The experiments are performed on a Linux machine with a single NVIDIA Tesla V100 32GB GPU. |
| Software Dependencies | No | The paper mentions the DGL library in section 4.1 and states 'DGL itself is a Python library' in section 6.4, but it does not provide specific version numbers for these or any other software dependencies like Python, PyTorch, or other libraries. |
| Experiment Setup | Yes | For attention efficiency, we constrain k and k̂ to be less than 20 in VCR-Graphormer, even for large-scale datasets. [...] Since heterophilous datasets are small, we only allow VCR-Graphormer to take less than 10 global neighbors for the Squirrel and Actor datasets, and less than 5 for the Texas dataset. [...] For the second component in Eq. 3.3, we used the symmetrically normalized transition matrix P = D^(-1/2) A D^(-1/2). For the third and fourth components in Eq. 3.3, we used the row-normalized transition matrix P = D^(-1) A. Also, for connecting content-based super nodes, we only use the labels of the training set. |
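The splits row states that small and heterophilous datasets use 60%/20%/20% train/val/test random splits. A minimal sketch of such a random node split, using a hypothetical node count and seed (not from the paper), could look like:

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed, not the paper's

# Hypothetical node count; the paper's small/heterophilous datasets
# use 60%/20%/20% train/val/test random splits over the nodes.
n_nodes = 1000
perm = rng.permutation(n_nodes)

n_train = int(0.6 * n_nodes)
n_val = int(0.2 * n_nodes)

train_idx = perm[:n_train]
val_idx = perm[n_train:n_train + n_val]
test_idx = perm[n_train + n_val:]

# The three parts are disjoint and cover every node exactly once.
assert len(train_idx) + len(val_idx) + len(test_idx) == n_nodes
```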
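The setup row mentions two transition matrices: the symmetrically normalized P = D^(-1/2) A D^(-1/2) and the row-normalized P = D^(-1) A. A small NumPy sketch of both, on a toy adjacency matrix (not one of the paper's datasets), could look like:

```python
import numpy as np

# Toy adjacency matrix of a 4-node undirected graph (illustrative only).
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
], dtype=float)

deg = A.sum(axis=1)                       # node degrees
D_inv = np.diag(1.0 / deg)                # D^(-1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))  # D^(-1/2)

# Symmetrically normalized transition matrix: P = D^(-1/2) A D^(-1/2)
P_sym = D_inv_sqrt @ A @ D_inv_sqrt

# Row-normalized (random-walk) transition matrix: P = D^(-1) A
P_rw = D_inv @ A

# Sanity checks: the random-walk matrix has rows summing to 1,
# and the symmetric normalization preserves symmetry of A.
assert np.allclose(P_rw.sum(axis=1), 1.0)
assert np.allclose(P_sym, P_sym.T)
```

The row-normalized form gives proper random-walk probabilities, while the symmetric form keeps the matrix symmetric (useful for spectral arguments); the paper uses each form for different components of Eq. 3.3.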