Graph Inductive Biases in Transformers without Message Passing
Authors: Liheng Ma, Chen Lin, Derek Lim, Adriana Romero-Soriano, Puneet K. Dokania, Mark Coates, Philip Torr, Ser-Nam Lim
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | GRIT achieves state-of-the-art empirical performance across a variety of graph learning benchmarks, both small and large-scale, showing the power that Graph Transformers without message-passing can deliver. Along with theoretical justification, we provide ample empirical evidence to demonstrate the effectiveness of our design choices. |
| Researcher Affiliation | Collaboration | 1McGill University, 2Department of Engineering Science, University of Oxford, 3CSAIL, Massachusetts Institute of Technology, 4Meta AI, 5Mila - Quebec AI Institute, 6Canada CIFAR AI Chair, 7Five AI, 8International Laboratory on Learning Systems (ILLS). |
| Pseudocode | No | The paper includes a visualization of the architecture (Figure 4) and mathematical equations, but no pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and models are publicly available at https://github.com/LiamMa/GRIT. |
| Open Datasets | Yes | We evaluate our proposed method on five benchmarks from the Benchmarking GNNs work (Dwivedi et al., 2022a) and two benchmarks from the recently developed Long-Range Graph Benchmark (Dwivedi et al., 2022b). In addition, we also conduct experiments on the larger datasets ZINC-full (~250,000 graphs) (Irwin et al., 2012) and PCQM4Mv2 (~3,700,000 graphs) (Hu et al., 2021). |
| Dataset Splits | Yes | Our experiments are conducted on the standard train/validation/test splits of the evaluated benchmarks. For each dataset, we execute 4 runs with different random seeds (0,1,2,3) and report the mean performance and standard deviation. |
| Hardware Specification | Yes | The timing is conducted on a single NVIDIA V100 GPU and 20 threads of an Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz. |
| Software Dependencies | No | The paper mentions using specific datasets from other works and refers to various GNN models, but does not list specific versions for software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The final hyperparameters are presented in Tables 9 and 10. These tables specify values for '# Transformer Layers', 'Hidden dim', '# Heads', 'Dropout', 'Attention dropout', 'Batch size', 'Learning Rate', '# Epochs', '# Warmup epochs', and 'Weight decay' for various datasets. |
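
The "Open Datasets" row lists the public benchmarks used. The paper does not prescribe specific loaders, but the released GRIT code builds on a PyTorch Geometric-based pipeline, so the following sketch shows one common, assumed way to obtain two of those benchmarks (ZINC and PCQM4Mv2) together with their standard splits.

```python
# Assumed setup: PyTorch Geometric for ZINC, OGB for PCQM4Mv2.
# These loaders are not mandated by the paper; they are one standard way
# to fetch the benchmarks and their official train/val/test splits.
from torch_geometric.datasets import ZINC
from ogb.lsc import PygPCQM4Mv2Dataset

# ZINC: subset=True loads the 12k-graph benchmark version,
# subset=False loads the ~250,000-graph "ZINC-full" dataset.
zinc_train = ZINC(root="data/ZINC", subset=True, split="train")
zinc_val = ZINC(root="data/ZINC", subset=True, split="val")
zinc_test = ZINC(root="data/ZINC", subset=True, split="test")

# PCQM4Mv2 (~3.7M molecular graphs) ships with an official split dictionary.
pcqm = PygPCQM4Mv2Dataset(root="data/PCQM4Mv2")
split_idx = pcqm.get_idx_split()  # keys: 'train', 'valid', 'test-dev', 'test-challenge'

print(len(zinc_train), len(pcqm[split_idx["train"]]))
```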
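The "Dataset Splits" row describes the evaluation protocol: four runs with seeds 0-3 on each benchmark's standard split, reported as mean and standard deviation. The minimal sketch below illustrates that aggregation; `run_experiment` is a hypothetical stand-in for the paper's training/evaluation pipeline, not a function from the GRIT repository.

```python
import random
import statistics

def run_experiment(seed: int) -> float:
    # Hypothetical placeholder for training GRIT with a given seed on the
    # standard split and returning the test metric; a dummy value is used
    # here so the sketch runs end to end.
    random.seed(seed)
    return 0.05 + random.uniform(-0.005, 0.005)

seeds = [0, 1, 2, 3]  # the random seeds reported in the paper
scores = [run_experiment(s) for s in seeds]
print(f"mean = {statistics.mean(scores):.4f} +/- {statistics.stdev(scores):.4f}")
```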
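The "Experiment Setup" row names the hyperparameter fields reported per dataset in Tables 9 and 10. The sketch below only mirrors those field names in a configuration object; the values shown are illustrative placeholders, not the settings from the paper's tables.

```python
from dataclasses import dataclass

@dataclass
class GRITConfig:
    # Field names follow the hyperparameters listed in Tables 9 and 10.
    num_transformer_layers: int   # "# Transformer Layers"
    hidden_dim: int               # "Hidden dim"
    num_heads: int                # "# Heads"
    dropout: float                # "Dropout"
    attention_dropout: float      # "Attention dropout"
    batch_size: int               # "Batch size"
    learning_rate: float          # "Learning Rate"
    num_epochs: int               # "# Epochs"
    num_warmup_epochs: int        # "# Warmup epochs"
    weight_decay: float           # "Weight decay"

# Illustrative placeholder values only; consult Tables 9 and 10 of the paper
# for the actual per-dataset settings.
example_cfg = GRITConfig(
    num_transformer_layers=10,
    hidden_dim=64,
    num_heads=8,
    dropout=0.0,
    attention_dropout=0.2,
    batch_size=32,
    learning_rate=1e-3,
    num_epochs=2000,
    num_warmup_epochs=50,
    weight_decay=1e-5,
)
print(example_cfg)
```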