27 Dec 2020 • Keke Zhai, Pan He, Tania Banerjee, Anand Rangarajan, Sanjay Ranka
In addition, when the GPUs are heterogeneous, it partitions the model so that computation is load-balanced across devices and communication overhead is reduced.
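
To make the load-balancing idea concrete, here is a minimal sketch of one way to split a model's layers across GPUs of different speeds. It is not the authors' algorithm: the function name `partition_layers`, the per-layer cost list, and the relative device speeds are all hypothetical inputs chosen for illustration, and the sketch ignores communication cost entirely. It assumes each device receives one contiguous block of layers and uses dynamic programming to minimize the time of the slowest stage.

```python
def partition_layers(layer_costs, device_speeds):
    """Split layers into contiguous stages, one per device (in order),
    minimizing the slowest stage's time. Hypothetical helper, illustration only.

    layer_costs:   assumed per-layer compute cost (arbitrary units)
    device_speeds: assumed relative throughput of each device
    Returns (bottleneck_time, [(start, end), ...]) with end exclusive.
    """
    n, k = len(layer_costs), len(device_speeds)
    if k == 0 or k > n:
        raise ValueError("need between 1 and len(layer_costs) devices")

    # prefix[i] = total cost of the first i layers
    prefix = [0.0]
    for cost in layer_costs:
        prefix.append(prefix[-1] + cost)

    inf = float("inf")
    # best[d][i]: smallest bottleneck when the first i layers use the first d devices
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0

    for d in range(1, k + 1):
        for i in range(d, n + 1):
            # device d-1 takes layers j..i-1; earlier devices take layers 0..j-1
            for j in range(d - 1, i):
                stage_time = (prefix[i] - prefix[j]) / device_speeds[d - 1]
                candidate = max(best[d - 1][j], stage_time)
                if candidate < best[d][i]:
                    best[d][i] = candidate
                    cut[d][i] = j

    # Walk the recorded cut points backwards to recover stage boundaries.
    bounds = []
    i = n
    for d in range(k, 0, -1):
        j = cut[d][i]
        bounds.append((j, i))
        i = j
    bounds.reverse()
    return best[k][n], bounds


if __name__ == "__main__":
    # Six equal-cost layers; the first device is assumed twice as fast as the second.
    print(partition_layers([4, 4, 4, 4, 4, 4], [2.0, 1.0]))
```

In this toy setting the faster device ends up with four layers and the slower one with two, so both stages take the same time. A fuller treatment would also charge for the activations sent across each stage boundary, which is where the reduction in communication overhead would enter.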