no code implementations • 5 Aug 2021 • Haytham ElFadeel, Stan Peshterliev
To reduce computational cost and latency, we propose decoupling the transformer machine reading comprehension (MRC) model into an input component and a cross component.
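A minimal sketch of the decoupling idea, assuming the usual split: the lower layers (the input component) encode passage and question independently, so passage representations can be precomputed and cached offline, while only the upper layers (the cross component) attend across both sequences at query time. All function names and the toy "encodings" below are illustrative, not the authors' actual code.

```python
# Illustrative sketch only: stand-ins for transformer layers, not a real model.

def input_component(tokens):
    # Stand-in for the lower transformer layers: encodes ONE sequence in
    # isolation (here, a trivial per-token integer "embedding").
    return [sum(ord(c) for c in t) for t in tokens]

def cross_component(passage_repr, question_repr):
    # Stand-in for the upper layers, where cross-attention between passage
    # and question happens; emits one toy score per passage position.
    q = sum(question_repr)
    return [p * q % 97 for p in passage_repr]

# Passage representations can be computed once and cached ahead of time...
cached_passage = input_component(["the", "cat", "sat"])
# ...then combined with the question's representation at query time, so the
# expensive joint computation runs over far fewer layers.
scores = cross_component(cached_passage, input_component(["where", "cat"]))
print(len(scores))
```

The latency win comes from the caching: the per-passage work moves offline, and only the cross component runs per query.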
no code implementations • 16 Mar 2021 • Haytham ElFadeel, Stan Peshterliev
In this paper, we explore multi-task learning (MTL) as a second pretraining step to learn enhanced universal language representations for transformer language models.
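A minimal sketch of MTL as a second pretraining step, under the common setup (assumed here, not taken from the paper): one shared encoder is updated by losses from several tasks, each with its own small head, with a task sampled at every step. The task names, shapes, and toy losses are all illustrative.

```python
import random

def shared_encoder(x, w):
    # Stand-in for the pretrained transformer body; w is its one "weight".
    return [xi * w for xi in x]

# Per-task heads (illustrative task names, each head a single scalar).
heads = {"nli": 0.5, "qa": 1.5, "paraphrase": 1.0}

def task_loss(task, features):
    # Toy per-task loss: squared distance of the head output from 1.0.
    out = sum(features) * heads[task]
    return (out - 1.0) ** 2

random.seed(0)
w = 0.1
for step in range(100):
    task = random.choice(list(heads))          # sample a task each step
    loss = task_loss(task, shared_encoder([1.0, 2.0], w))
    # Finite-difference "gradient" step on the shared weight only:
    # every task's loss flows into the same encoder parameters.
    eps = 1e-4
    grad = (task_loss(task, shared_encoder([1.0, 2.0], w + eps)) - loss) / eps
    w -= 0.02 * grad

print(w)
```

The point of the sketch is the update pattern: the heads stay task-specific while the shared weight absorbs signal from every task, which is what pushes the encoder toward a more universal representation.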