Relational Attention: Generalizing Transformers for Graph-Structured Tasks

Authors: Cameron Diao, Ricky Loynd

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate this relational transformer on a diverse array of graph-structured tasks, including the large and challenging CLRS Algorithmic Reasoning Benchmark. Our analysis demonstrates that these gains are attributable to relational attention's inherent ability to leverage the greater expressivity of graphs over sets. We evaluate RT against common GNNs on the diverse set of graph-structured tasks provided by CLRS-30 (Veličković et al., 2022)." (A hedged sketch of relational attention follows this table.)
Researcher Affiliation | Collaboration | Cameron Diao, Department of Computer Science, Rice University (cwd2@rice.edu); Ricky Loynd, Microsoft Research (riloynd@microsoft.com)
Pseudocode | No | The paper describes mathematical equations and a process, but does not include a block labeled 'Pseudocode' or 'Algorithm'.
Open Source Code | Yes | "We introduce the relational transformer for application to arbitrary graph-structured tasks, and make the implementation available at https://github.com/CameronDiao/relational-transformer."
Open Datasets | Yes | "We evaluate RT against common GNNs on the diverse set of graph-structured tasks provided by CLRS-30 (Veličković et al., 2022). CLRS-30 provides canonical datasets (training, validation, and test) which can also be generated from specific random seeds: 1, 2, 3."
Dataset Splits | Yes | "CLRS-30 provides canonical datasets (training, validation, and test) which can also be generated from specific random seeds: 1, 2, 3. The graphs in the training and validation datasets contain 16 nodes, while the test graphs are of size 64 to evaluate the out-of-distribution (OOD) generalization of models. During training, the model is evaluated on the validation set after every 320 examples." (A data-loading sketch follows this table.)
Hardware Specification | Yes | "Training speed in examples per second on a T4 GPU, on the reference algorithm Bellman-Ford."
Software Dependencies | No | The paper mentions that 'the CLRS-30 framework is written in Jax' but does not specify a version number for Jax or any other software dependency.
Experiment Setup | Yes | "To tune the hyperparameters of RT and the CLRS-30 baseline GNNs, we used Distributed Grid Descent (DGD) (Loynd et al., 2020), a self-guided form of random search. Table 2 lists the tuned hyperparameter values for CLRS-30 experiments, and Table 3 reports the sets of values considered in those searches." (A simplified hyperparameter-search sketch follows this table.)
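For context on the technique being assessed: relational attention conditions a transformer's queries, keys, and values on per-edge features, so each directed node pair gets its own attention computation. The single-head sketch below is only an illustration under assumed parameter names (Wq_n, Wq_e, and so on) and dense (n, n, d) edge features; it approximates the idea rather than reproducing the paper's exact equations.

```python
# Minimal single-head sketch of relational attention over a dense graph.
# x: (n, d) node features; e: (n, n, d) edge features for every directed pair.
# Parameter names (Wq_n, Wq_e, ...) are assumptions for illustration.
import jax
import jax.numpy as jnp

def relational_attention(x, e, params):
    n, d = x.shape
    # Queries and keys mix node and edge features, so the attention logit
    # for the pair (i, j) can depend on the edge representation e[i, j].
    q = x @ params["Wq_n"]                          # (n, d) receiver queries
    k = x @ params["Wk_n"]                          # (n, d) sender keys
    q_e = e @ params["Wq_e"]                        # (n, n, d) edge-query terms
    k_e = e @ params["Wk_e"]                        # (n, n, d) edge-key terms
    logits = jnp.einsum("ijd,ijd->ij",
                        q[:, None, :] + q_e,
                        k[None, :, :] + k_e) / jnp.sqrt(d)
    attn = jax.nn.softmax(logits, axis=-1)          # normalize over senders j
    # Values also mix node and edge features before aggregation.
    v = x @ params["Wv_n"]                          # (n, d)
    v_e = e @ params["Wv_e"]                        # (n, n, d)
    messages = v[None, :, :] + v_e                  # (n, n, d)
    return jnp.einsum("ij,ijd->id", attn, messages) # (n, d) updated nodes
```

Mixing edge features into the queries, keys, and values is what lets attention express per-pair relations instead of treating the input as an unordered set, which is the intuition behind the expressivity claim quoted in the Research Type row.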
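Given the split description above (16-node train and validation graphs, 64-node test graphs), a typical way to load the canonical CLRS-30 data is through DeepMind's `clrs` package. The call below follows the public CLRS README; treat the exact signature and the folder path as assumptions to verify against your installed version.

```python
# Hedged sketch: loading the canonical CLRS-30 splits with DeepMind's
# `clrs` package. The create_dataset call follows the public CLRS README;
# verify the signature against your installed version.
import clrs

# Train/validation graphs have 16 nodes in the canonical datasets.
train_ds, num_samples, spec = clrs.create_dataset(
    folder="/tmp/CLRS30",        # assumed local cache path
    algorithm="bellman_ford",    # the reference algorithm named in the table
    split="train",
    batch_size=32)

# The held-out test split uses 64-node graphs to measure OOD generalization.
test_ds, _, _ = clrs.create_dataset(
    folder="/tmp/CLRS30",
    algorithm="bellman_ford",
    split="test",
    batch_size=32)
```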
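The Experiment Setup row cites Distributed Grid Descent (DGD) (Loynd et al., 2020). The sketch below is a deliberately simplified, single-process stand-in for that distributed procedure, with a hypothetical search space in place of the paper's Table 3; it only conveys the self-guided, one-hyperparameter-at-a-time flavor of the search.

```python
# Simplified, single-process stand-in for Distributed Grid Descent
# (Loynd et al., 2020): perturb one hyperparameter at a time and keep
# improvements. The real DGD coordinates many parallel workers that share
# results; SEARCH_SPACE below is hypothetical, not the paper's Table 3.
import random

SEARCH_SPACE = {
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "hidden_dim": [64, 128, 256],
    "num_layers": [1, 2, 3],
}

def dgd_like_search(evaluate, num_steps=50, seed=0):
    rng = random.Random(seed)
    # Start from a uniformly sampled configuration.
    best = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
    best_score = evaluate(best)
    for _ in range(num_steps):
        candidate = dict(best)
        axis = rng.choice(list(SEARCH_SPACE))       # vary a single axis
        candidate[axis] = rng.choice(SEARCH_SPACE[axis])
        score = evaluate(candidate)
        if score > best_score:                      # greedy acceptance
            best, best_score = candidate, score
    return best, best_score
```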