16 Jul 2023 • Krishna Teja Chitty-Venkata, Sparsh Mittal, Murali Emani, Venkatram Vishwanath, Arun K. Somani
This paper presents a comprehensive survey of techniques for optimizing the inference phase of transformer networks.