Towards Principled Graph Transformers

Authors: Luis Müller, Daniel Kusuma, Blai Bonet, Christopher Morris

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we demonstrate that the Edge Transformer surpasses other theoretically aligned architectures regarding predictive performance and is competitive with state-of-the-art models on algorithmic reasoning and molecular regression tasks while not relying on positional or structural encodings. Our code is available at https://github.com/luis-mueller/towards-principled-gts.
Researcher Affiliation | Academia | Luis Müller (RWTH Aachen University, luis.mueller@cs.rwth-aachen.de), Daniel Kusuma (RWTH Aachen University), Blai Bonet (Universitat Pompeu Fabra), Christopher Morris (RWTH Aachen University)
Pseudocode | Yes | Algorithm 1: Comparison between standard attention and triangular attention in PyTorch-like pseudo-code. (A hedged sketch of this comparison follows the table.)
Open Source Code | Yes | Our code is available at https://github.com/luis-mueller/towards-principled-gts.
Open Datasets | Yes | ZINC (12K), ALCHEMY (12K), and ZINC-FULL are available at https://pyg.org under an MIT license. PCQM4MV2 is available at https://ogb.stanford.edu/docs/lsc/pcqm4mv2/ under a CC BY 4.0 license. The CLRS benchmark is available at https://github.com/google-deepmind/clrs under an Apache 2.0 license. The BREC benchmark is available at https://github.com/GraphPKU/BREC under an MIT license.
Dataset Splits | Yes | For ZINC (12K), ZINC-FULL, PCQM4MV2, CLRS, and BREC, we follow the standard train/validation/test splits.
Hardware Specification | Yes | All experiments were performed on a mix of NVIDIA A10, L40, and A100 GPUs. For each run, we used at most 8 CPU cores and 64 GB of RAM, with the exception of PCQM4MV2 and ZINC-FULL, which were trained on 4 L40 GPUs with 16 CPU cores and 256 GB of RAM.
Software Dependencies | No | The paper mentions software such as PyTorch, JAX, and Triton [45] but does not provide specific version numbers for these components, which are necessary for a reproducible description of the ancillary software.
Experiment Setup | Yes | Table 6: Hyperparameters of the Edge Transformer across all datasets. The table lists, per dataset, the learning rate, gradient clipping norm, batch size, optimizer, number of layers, hidden dimension, number of heads, activation, pooling, RRWP dimension, weight decay, dropout, attention dropout, number of steps, number of warm-up steps, number of epochs, number of warm-up epochs, and number of RRWP steps. (An illustrative configuration skeleton with these fields also follows the table.)
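
The paper's Algorithm 1 contrasts standard self-attention with the Edge Transformer's triangular attention in PyTorch-like pseudo-code; the exact listing is in the paper. Below is a minimal, single-head sketch of that contrast. The function names, the single-head simplification, and the particular value parameterization (two projections combined multiplicatively over the intermediate node) are assumptions made here for illustration, not the authors' code.

```python
import torch
import torch.nn.functional as F


def standard_attention(x, w_q, w_k, w_v):
    """Standard self-attention over node features x of shape [n, d]."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-1, -2) / k.shape[-1] ** 0.5   # [n, n]
    return F.softmax(scores, dim=-1) @ v                     # [n, d]


def triangular_attention(x, w_q, w_k, w_v1, w_v2):
    """Triangular attention over pair features x of shape [n, n, d].

    The pair (i, j) attends over all intermediate nodes l, combining
    information from the pairs (i, l) and (l, j).
    """
    q, k = x @ w_q, x @ w_k
    v1, v2 = x @ w_v1, x @ w_v2
    # Attention score of pair (i, j) for intermediate node l:
    # dot product between q of (i, l) and k of (l, j).
    scores = torch.einsum("ild,ljd->ijl", q, k) / k.shape[-1] ** 0.5  # [n, n, n]
    a = F.softmax(scores, dim=-1)
    # Weighted sum over l of an elementwise combination of (i, l) and (l, j).
    return torch.einsum("ijl,ild,ljd->ijd", a, v1, v2)                # [n, n, d]


# Illustrative usage with random inputs.
n, d = 8, 16
pairs = torch.randn(n, n, d)
w_q, w_k, w_v1, w_v2 = (torch.randn(d, d) for _ in range(4))
out = triangular_attention(pairs, w_q, w_k, w_v1, w_v2)  # shape [8, 8, 16]
```

In this form the layer maintains O(n²) pair states and the attention itself costs O(n³) time per layer, which reflects the cubic scaling of triangular attention discussed for the Edge Transformer.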
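
For the experiment setup, Table 6 of the paper gives the concrete hyperparameter values per dataset. As an illustration only, a configuration skeleton covering the fields named in that table might look as follows; the field names mirror Table 6, the values are deliberately left unset because the per-dataset values are in the paper, and the dataclass itself is an assumption made here, not the authors' configuration code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EdgeTransformerConfig:
    """Hypothetical container for the hyperparameter fields listed in Table 6."""
    # Optimization
    learning_rate: Optional[float] = None
    grad_clip_norm: Optional[float] = None
    batch_size: Optional[int] = None
    optimizer: Optional[str] = None
    weight_decay: Optional[float] = None
    # Architecture
    num_layers: Optional[int] = None
    hidden_dim: Optional[int] = None
    num_heads: Optional[int] = None
    activation: Optional[str] = None
    pooling: Optional[str] = None
    rrwp_dim: Optional[int] = None
    num_rrwp_steps: Optional[int] = None
    # Regularization
    dropout: Optional[float] = None
    attention_dropout: Optional[float] = None
    # Training schedule (steps or epochs, depending on the dataset)
    num_steps: Optional[int] = None
    num_warmup_steps: Optional[int] = None
    num_epochs: Optional[int] = None
    num_warmup_epochs: Optional[int] = None
```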