Tighter Bounds on the Expressivity of Transformer Encoders

Authors: David Chiang, Peter Cholak, Anand Pillay

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Characterizing neural networks in terms of better-understood formal systems has the potential to yield new insights into the power and limitations of these networks. Doing so for transformers remains an active area of research. Bhattamishra and others have shown that transformer encoders are at least as expressive as a certain kind of counter machine, while Merrill and Sabharwal have shown that fixed-precision transformer encoders recognize only languages in uniform TC0. We connect and strengthen these results by identifying a variant of first-order logic with counting quantifiers that is simultaneously an upper bound for fixed-precision transformer encoders and a lower bound for transformer encoders. This brings us much closer than before to an exact characterization of the languages that transformer encoders recognize.
Researcher Affiliation | Academia | University of Notre Dame, USA.
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks; it relies on mathematical proofs and formal definitions.
Open Source Code | No | The paper does not mention releasing any open-source code for the methodology described.
Open Datasets | No | This is a theoretical paper and does not involve training on a dataset.
Dataset Splits | No | This is a theoretical paper and does not involve dataset splits for validation.
Hardware Specification | No | This is a theoretical paper and does not describe any hardware used for experiments.
Software Dependencies | No | This is a theoretical paper and does not list any software dependencies with version numbers.
Experiment Setup | No | This is a theoretical paper and does not describe any experimental setup details such as hyperparameters or training settings.
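
As a brief, hedged illustration of the counting-quantifier logic referenced in the abstract above (the formula below is an assumed example for intuition, not quoted from the paper): a counting quantifier of the form \(\exists^{=x} p.\ \varphi(p)\) asserts that exactly x positions p satisfy \(\varphi\). For instance, the language of strings with equally many a's and b's can be defined by

\[
\exists x.\; \bigl(\exists^{=x} p.\ Q_a(p)\bigr) \wedge \bigl(\exists^{=x} p.\ Q_b(p)\bigr),
\]

where \(Q_a(p)\) holds when position p carries the symbol a.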