Long Range Arena: A Benchmark for Efficient Transformers

Authors: Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "3 EXPERIMENTAL RESULTS. Table 1: Experimental results on Long-Range Arena benchmark."
Researcher Affiliation | Industry | "1Google Research 2Google DeepMind {yitay, dehghani}@google.com"
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | "Our framework, which we plan to open source, is written in JAX/FLAX."
Open Datasets | Yes | "We use the IMDb reviews (Maas et al., 2011) dataset, which is a commonly used dataset to benchmark document classification." "We use the ACL Anthology Network (AAN; Radev et al., 2013) dataset." "In LRA, we use the CIFAR-10 dataset (Krizhevsky, 2009) for the image classification task." (See the loading sketch below.)
Dataset Splits | Yes | "averaged over 1K random samples from the validation set."
Hardware Specification | Yes | "Benchmarks are run on 4x4 TPU V3 Chips." "We conduct experiments on 4x4 TPU V3 Chips."
Software Dependencies | No | "Our framework, which we plan to open source, is written in JAX/FLAX." "We implement our benchmark (which includes the task, evaluators, and models) in Python 3 and Jax/Flax." No specific version numbers for JAX/FLAX or other libraries are provided.
Experiment Setup | Yes | "All our xformer models have an embedding dimension of 512, 8 heads, 6 layers and a feed-forward dimensions of 2048. We train all models for 5K steps." "All xformer models are parameterized by the same number of layers, heads and hidden dimensions, namely 8 heads, 512 hidden dimensions and d = 2048 for positional FFN layers. We use 6 layers for all xformers. The learning rate is 0.05 with weight decay of 0.1. We use Adam with warmup. All models are trained for 20K steps and a batch size of 32." (The 5K- and 20K-step figures are quoted from different task setups; see the config sketch below.)
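For reference, the open datasets named above are all publicly distributed. Below is a minimal loading sketch, an illustration rather than the authors' released pipeline; the paper does not state that tensorflow_datasets was used, and the AAN corpus is not in TFDS and has to be obtained from its own distribution.

```python
# Minimal sketch: fetching two of the public datasets named in the paper
# via tensorflow_datasets. Illustrative only; the paper does not state
# that TFDS was part of its pipeline.
import tensorflow_datasets as tfds

# IMDb reviews (Maas et al., 2011), used for byte-level text classification.
imdb_train, imdb_test = tfds.load(
    "imdb_reviews", split=["train", "test"], as_supervised=True
)

# CIFAR-10 (Krizhevsky, 2009), used for the pixel-sequence image task.
cifar_train, cifar_test = tfds.load(
    "cifar10", split=["train", "test"], as_supervised=True
)

# The AAN corpus (Radev et al., 2013) is distributed separately and is
# not available through TFDS.
```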
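The shared model and optimizer settings in the Experiment Setup row can be collected into a small config. The sketch below uses optax, which pairs with the paper's JAX/Flax stack; the paper says only "Adam with warmup", so the warmup length, the linear schedule shape, and the use of decoupled weight decay (adamw) are all assumptions.

```python
# Minimal sketch, not the authors' code: the shared xformer hyperparameters
# and an Adam-with-warmup optimizer expressed with optax.
import optax

config = dict(
    emb_dim=512,         # embedding dimension
    num_heads=8,         # attention heads
    num_layers=6,        # transformer layers
    mlp_dim=2048,        # positional FFN dimension (d = 2048)
    batch_size=32,
    train_steps=20_000,  # 20K steps (some task setups are quoted at 5K)
)

# Linear warmup from 0 to the peak learning rate of 0.05, then held
# constant. The 1K-step warmup length is an assumption; the paper does
# not quote one.
schedule = optax.linear_schedule(
    init_value=0.0, end_value=0.05, transition_steps=1_000
)

# Decoupled weight decay (adamw) is one reading of "Adam ... with weight
# decay of 0.1".
optimizer = optax.adamw(learning_rate=schedule, weight_decay=0.1)
```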