1 code implementation • 15 Feb 2021 • Max Horn, Kumar Shridhar, Elrich Groenewald, Philipp F. M. Baumann
While Transformer architectures have shown remarkable success, they are bound to the computation of all pairwise interactions of input elements and thus suffer from limited scalability.