Probabilistic Graph Rewiring via Virtual Nodes

Authors: Chendi Qian, Andrei Manolache, Christopher Morris, Mathias Niepert

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we validate our approach by showcasing its ability to mitigate under-reaching and over-squashing effects, achieving state-of-the-art performance across multiple graph datasets.
Researcher Affiliation | Collaboration | 1 Computer Science Department, RWTH Aachen University, Germany; 2 Computer Science Department, University of Stuttgart, Germany; 3 IMPRS-IS; 4 Bitdefender, Romania
Pseudocode | No | The paper describes the message-passing architecture using descriptive text and mathematical equations in Section 3 but does not include a formal pseudocode block or algorithm figure. (A hedged sketch of the virtual-node message passing appears below the table.)
Open Source Code | Yes | An open repository of our code can be accessed at https://github.com/chendiqian/IPR-MPNN.
Open Datasets | Yes | Our results demonstrate that IPR-MPNNs effectively account for long-range relationships, achieving state-of-the-art performance on the PEPTIDES and PCQM-CONTACT datasets, as detailed in Table 2. Notably, on the PCQM-CONTACT link prediction tasks, IPR-MPNNs outperform all other candidates across three measurement metrics outlined in Tönshoff et al. [2023]. For QM9, we show in Table 1 that IPR-MPNNs greatly outperform similar methods, obtaining the best results on 12 of 13 target properties. On ZINC and OGB-MOLHIV, we outperform similar MPNNs and graph transformers, namely GPS [Rampášek et al., 2022] and SAT [Chen et al., 2022a], obtaining state-of-the-art results; see Table 4. For the TUDATASET collection, we achieve the best results on four of the five molecular datasets; see Table A9.
Dataset Splits | Yes | We use the official dataset splits when available. Notably, for the TUDATASET, WEBKB datasets [Craven et al., 1998] and heterophilic datasets proposed in Platonov et al. [2023], we perform a 10-fold cross-validation and report the average validation performance, similarly to the other methods that we compare with. (A sketch of this protocol follows the table.)
Hardware Specification | Yes | All experiments were performed on a mixture of A10, A100, A5000, and RTX 4090 NVIDIA GPUs. For each run, we used at most eight CPUs and 64 GB of RAM.
Software Dependencies | No | The paper mentions that 'All datasets are available through the interface of PyTorch Geometric,' but it does not specify version numbers for PyTorch Geometric or any other software dependencies. (A hedged dataset-loading example is shown below the table.)
Experiment Setup | Yes | In all of our real-world experiments, we use two virtual nodes with a hidden dimension twice as large as the base nodes. We randomly initialize the features of the virtual nodes. For the upstream and downstream models, we do a hyperparameter search; see Table A5. We use RWSE and LapPE positional encodings [Dwivedi et al., 2022a] for all of our experiments as additional node features... We optimize the network using Adam [Kingma and Ba, 2015] with a cosine annealing learning rate scheduler. (A minimal optimizer/scheduler skeleton follows below.)
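
As noted in the Pseudocode row, the paper specifies its message passing only in prose and equations (Section 3). The following minimal PyTorch sketch illustrates the general base-to-virtual-to-base message-passing idea only; it is not the authors' exact architecture, and the class name VirtualNodeLayer, the soft assignment matrix assign, and all dimensions are hypothetical.

    import torch
    import torch.nn as nn

    class VirtualNodeLayer(nn.Module):
        """One base -> virtual -> base message-passing step (illustrative only)."""

        def __init__(self, base_dim: int, virtual_dim: int):
            super().__init__()
            self.to_virtual = nn.Linear(base_dim, virtual_dim)
            self.to_base = nn.Linear(virtual_dim, base_dim)

        def forward(self, h_base, h_virtual, assign):
            # Aggregate base-node features into each virtual node, weighted by
            # a (probabilistic) base-to-virtual assignment matrix.
            msg_up = assign.t() @ self.to_virtual(h_base)   # (n_virtual, virtual_dim)
            h_virtual = h_virtual + torch.relu(msg_up)
            # Broadcast the virtual-node state back down to the base nodes.
            msg_down = assign @ self.to_base(h_virtual)     # (n_base, base_dim)
            h_base = h_base + torch.relu(msg_down)
            return h_base, h_virtual

    # Toy usage: 6 base nodes, 2 virtual nodes with twice the hidden width,
    # mirroring the setup quoted in the Experiment Setup row.
    layer = VirtualNodeLayer(base_dim=32, virtual_dim=64)
    h_base, h_virtual = torch.randn(6, 32), torch.randn(2, 64)
    assign = torch.softmax(torch.randn(6, 2), dim=-1)       # soft assignments
    h_base, h_virtual = layer(h_base, h_virtual, assign)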
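
For the Dataset Splits row: where official splits are unavailable, the quoted protocol is 10-fold cross-validation with the average validation performance reported. A minimal sketch of that protocol, with a placeholder train_and_eval function standing in for the real training pipeline:

    import numpy as np
    from sklearn.model_selection import KFold

    def train_and_eval(train_idx, val_idx):
        # Placeholder for the actual training/evaluation pipeline;
        # it returns a dummy score so the sketch runs end to end.
        return float(len(val_idx)) / (len(train_idx) + len(val_idx))

    n_graphs = 1000  # placeholder dataset size
    kf = KFold(n_splits=10, shuffle=True, random_state=0)
    scores = [train_and_eval(tr, va) for tr, va in kf.split(np.arange(n_graphs))]
    print(f"mean validation performance over 10 folds: {np.mean(scores):.4f}")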
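
For the Software Dependencies row: the paper states that all datasets are available through PyTorch Geometric but pins no versions. Assuming a current PyTorch Geometric release, ZINC (one of the benchmarks cited above) can be loaded with its official splits as follows; the root path and batch size are arbitrary:

    from torch_geometric.datasets import ZINC
    from torch_geometric.loader import DataLoader

    # ZINC ships with official train/val/test splits in PyTorch Geometric;
    # subset=True selects the standard 12k-graph benchmark subset.
    train_set = ZINC(root="data/ZINC", subset=True, split="train")
    val_set = ZINC(root="data/ZINC", subset=True, split="val")
    loader = DataLoader(train_set, batch_size=128, shuffle=True)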
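
For the Experiment Setup row: the quoted optimizer is Adam with a cosine annealing learning-rate scheduler. A minimal, hypothetical skeleton of that pairing in PyTorch; the stand-in model, learning rate, and epoch count are placeholders, not the tuned values from the paper's Table A5:

    import torch

    model = torch.nn.Linear(32, 1)  # stand-in for the actual IPR-MPNN
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

    for epoch in range(100):
        optimizer.zero_grad()
        loss = model(torch.randn(8, 32)).pow(2).mean()  # dummy loss
        loss.backward()
        optimizer.step()
        scheduler.step()  # anneal the learning rate once per epoch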