Transformers

BART is a denoising autoencoder for pretraining sequence-to-sequence models. It is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard Transformer-based neural machine translation architecture. It uses a standard seq2seq/NMT architecture with a bidirectional encoder (like BERT) and a left-to-right decoder (like GPT). This means the encoder's attention mask is fully visible, like BERT, and the decoder's attention mask is causal, like GPT2.

Source: BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Papers


Paper Code Results Date Stars

Tasks


Task Papers Share
Retrieval 129 13.11%
Question Answering 79 8.03%
Language Modelling 71 7.22%
Text Generation 64 6.50%
Abstractive Text Summarization 45 4.57%
Sentence 40 4.07%
Text Summarization 27 2.74%
Large Language Model 21 2.13%
Information Retrieval 21 2.13%

Categories