1 code implementation • 30 Nov 2023 • Penghui Qi, Xinyi Wan, Guangxing Huang, Min Lin
Pipeline parallelism is one of the key components of large-scale distributed training, yet its efficiency suffers from pipeline bubbles, which were deemed inevitable.