no code implementations • 26 Oct 2022 • Andrew Audibert, Yang Chen, Dan Graur, Ana Klimovic, Jiri Simsa, Chandramohan A. Thekkath
The amount of host CPU and RAM required per accelerator core to process input data without stalling ML computations varies across jobs.
no code implementations • 23 Mar 2022 • Paul Barham, Aakanksha Chowdhery, Jeff Dean, Sanjay Ghemawat, Steven Hand, Dan Hurt, Michael Isard, Hyeontaek Lim, Ruoming Pang, Sudip Roy, Brennan Saeta, Parker Schuh, Ryan Sepassi, Laurent El Shafey, Chandramohan A. Thekkath, Yonghui Wu
We present the design of a new large-scale orchestration layer for accelerators.