6 May 2024 • Tao Yu, Gaurav Gupta, Karthick Gopalswamy, Amith Mamidala, Hao Zhou, Jeffrey Huynh, Youngsuk Park, Ron Diamant, Anoop Deoras, Luke Huan
Training large models is plagued by intense compute cost and limited hardware memory.