no code implementations • 2 Mar 2024 • Ahmed F. AbouElhamayed, Susanne Balle, Deshanand Singh, Mohamed S. Abdelfattah
Our results consistently demonstrate that end-to-end application performance can easily be dominated by data processing and data movement functions (up to 56% of end-to-end latency in a medium-sized image, and $\sim$ 80% impact on system throughput in a large image), even though these functions have been conventionally overlooked in deep learning system design.