Systematic Generalization with Edge Transformers
Authors: Leon Bergen, Timothy O'Donnell, Dzmitry Bahdanau
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the Edge Transformer on compositional generalization benchmarks in relational reasoning, semantic parsing, and dependency parsing. In all three settings, the Edge Transformer outperforms Relation-aware, Universal, and classical Transformer baselines. In our experiments we compare the systematic generalization ability of Edge Transformers to that of Transformers (Vaswani et al., 2017), Universal Transformers (Dehghani et al., 2019), Relation-aware Transformers (Shaw et al., 2018), Graph Attention Networks (Veličković et al., 2018) and other baselines. |
| Researcher Affiliation | Collaboration | Leon Bergen, University of California, San Diego (lbergen@ucsd.edu); Timothy J. O'Donnell, McGill University, Quebec Artificial Intelligence Institute (Mila), Canada CIFAR AI Chair; Dzmitry Bahdanau, Element AI (a ServiceNow company), McGill University, Quebec Artificial Intelligence Institute (Mila), Canada CIFAR AI Chair |
| Pseudocode | No | The paper provides mathematical equations and descriptions of the model's computations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code for our experiments can be found at: github.com/bergen/EdgeTransformer |
| Open Datasets | Yes | We focus on three synthetic benchmarks with carefully controlled train-test splits: Compositional Language Understanding and Text-based Relational Reasoning (CLUTRR), proposed by Sinha et al. (2019); Compositional Freebase Questions (CFQ), proposed by Keysers et al. (2020); and the Compositional Generalization Challenge based on Semantic Interpretation (COGS) by Kim and Linzen (2020). |
| Dataset Splits | No | The paper discusses 'train-test splits' and tuning hyperparameters on a 'random split', implying the use of a validation set. However, it does not provide specific percentages or sample counts for training, validation, and test splits needed to reproduce the data partitioning. For example, it mentions 'the original CLUTRR training set' and 'a larger training set', but no explicit validation set size. |
| Hardware Specification | Yes | We gratefully acknowledge the support of NVIDIA Corporation with the donation of two Titan V GPUs used for this research. Even after this filtering, training an Edge Transformer model on CFQ semantic parsing requires 1-2 days using 4 NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper mentions using the 'Stanza framework for dependency parsing' and notes that the model is implemented with the 'Einstein summation operation which is readily available in modern deep learning frameworks' (a minimal einsum sketch of this operation follows the table). However, it does not provide specific version numbers for these or for any other software dependencies such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | For the chosen hyperparameter settings see Table 1. Table 1: Hyperparameter settings for the Edge Transformer and for the baselines. L is the number of layers, d is the dimensionality, h is the number of heads, B is the batch size, ρ is the learning rate, T is training duration. |
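The "Einstein summation" remark in the Software Dependencies row refers to the tensor contraction at the heart of the Edge Transformer's triangular attention, in which each edge (i, j) attends over intermediate nodes l by combining the two legs (i, l) and (l, j). The sketch below is a minimal single-head PyTorch illustration of that contraction, not the authors' released implementation; the exact projection scheme (which edge supplies queries, keys, and the two value terms) is an assumption based on the paper's description.

```python
import torch
import torch.nn.functional as F

def triangular_attention(x, w_q, w_k, w_v1, w_v2):
    """Single-head triangular attention update over edge states (sketch).

    x: edge representations of shape (n, n, d), where x[i, j] is the
    state of the edge from node i to node j. Edge (i, j) attends over
    intermediate nodes l, combining the two legs (i, l) and (l, j).
    All weight matrices are assumed to have shape (d, d).
    """
    n, _, d = x.shape
    q = x @ w_q    # queries taken from the edge being updated, (i, j)
    k = x @ w_k    # keys taken from the first leg, (i, l)
    v1 = x @ w_v1  # values from the first leg, (i, l)
    v2 = x @ w_v2  # values from the second leg, (l, j)

    # Attention logits over intermediate nodes l: shape (n, n, n), indexed [i, l, j].
    logits = torch.einsum('ijd,ild->ilj', q, k) / d ** 0.5
    alpha = F.softmax(logits, dim=1)  # normalize over the l dimension

    # Combine the two legs elementwise, then aggregate with the attention weights.
    return torch.einsum('ilj,ild,ljd->ijd', alpha, v1, v2)

# Example: random edge states for a 5-node graph with 64-dimensional edges.
x = torch.randn(5, 5, 64)
w = [torch.randn(64, 64) for _ in range(4)]
y = triangular_attention(x, *w)  # shape (5, 5, 64)
```

The (n, n, n) logit tensor makes the cubic cost of the triangular update explicit, which is consistent with the Hardware Specification row: even after filtering, training on CFQ takes 1-2 days on 4 V100 GPUs.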