Paper tables with annotated results for Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages


Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages

The expressive power of transformers over inputs of unbounded size can be studied through their ability to recognize classes of formal languages. We consider transformer encoders with hard attention (in which all attention is focused on exactly one position) and strict future masking (in which each position only attends to positions strictly to its left), and prove that they are equivalent to linear temporal logic (LTL), which defines exactly the star-free languages. A key technique is the use of Boolean RASP as a convenient intermediate language between transformers and LTL. We then take numerous results known for LTL and apply them to transformers, characterizing how position embeddings, strict masking, and depth increase expressive power.
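To make the attention pattern described above concrete, here is a minimal sketch of one hard-attention step with strict future masking. It is not taken from the paper: the function name, the plain-list representation of scores and values, the `default` output for the first position (which has nothing to its left), and the choice to break ties toward the rightmost position are all assumptions made for illustration.

```python
from typing import List, Sequence


def strict_masked_hard_attention(
    scores: Sequence[Sequence[float]],
    values: Sequence[float],
    default: float = 0.0,
) -> List[float]:
    """Return, for each position i, the value at the single attended position.

    Hypothetical helper for illustration only.
    scores[i][j] is the attention score of query position i for key position j.
    Strict future masking: position i may only look at positions j < i.
    Hard attention: all weight goes to exactly one maximal-score position.
    Assumption: ties are broken toward the rightmost candidate.
    """
    outputs: List[float] = []
    for i in range(len(values)):
        best_j = None
        for j in range(i):  # only positions strictly to the left of i
            # ">=" means a later position with an equal score wins the tie
            if best_j is None or scores[i][j] >= scores[i][best_j]:
                best_j = j
        # Position 0 has no visible positions, so it falls back to the default
        outputs.append(values[best_j] if best_j is not None else default)
    return outputs
```

Under these assumptions, each output depends on exactly one earlier position, in contrast to softmax attention, which averages over all visible positions.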

