Longformer for MS MARCO Document Re-ranking Task
Two-step document ranking, where an initial retrieval by a classical information retrieval method is followed by a neural re-ranking model, has become the new standard. The best performance is achieved with transformer-based models as re-rankers, e.g., BERT. We employ Longformer, a BERT-like model for long documents, on the MS MARCO document re-ranking task. The complete code used for training the model is available at: https://github.com/isekulic/longformer-marco
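To make the two-step setup concrete, below is a minimal sketch of the second (re-ranking) step, assuming the public HuggingFace "allenai/longformer-base-4096" checkpoint; the score_documents helper and the candidate list are illustrative, not taken from the paper's repository. A cross-encoder scores each query-document pair jointly, and Longformer's 4096-token window lets whole documents fit without the passage splitting BERT re-rankers typically need.

```python
# Sketch of re-ranking first-stage candidates with a Longformer cross-encoder.
# Note: the classification head below is randomly initialized and would need
# fine-tuning on MS MARCO relevance labels before producing useful scores.
import torch
from transformers import LongformerTokenizer, LongformerForSequenceClassification

tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerForSequenceClassification.from_pretrained(
    "allenai/longformer-base-4096", num_labels=2
)
model.eval()

def score_documents(query: str, documents: list[str]) -> list[float]:
    """Score each candidate document against the query (hypothetical helper)."""
    scores = []
    for doc in documents:
        # Encode the query-document pair; long documents are truncated to
        # Longformer's 4096-token window instead of BERT's 512.
        inputs = tokenizer(
            query, doc, truncation=True, max_length=4096, return_tensors="pt"
        )
        with torch.no_grad():
            logits = model(**inputs).logits
        # Use the probability of the "relevant" class as the ranking score.
        scores.append(torch.softmax(logits, dim=-1)[0, 1].item())
    return scores

# Re-rank the candidates returned by the first-stage (e.g., BM25) retriever.
candidates = ["first retrieved document ...", "second retrieved document ..."]
ranked = sorted(
    zip(candidates, score_documents("example query", candidates)),
    key=lambda pair: pair[1],
    reverse=True,
)
```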
Methods
Adam • AdamW • Attention Dropout • BERT • Dense Connections • Dilated Sliding Window Attention • Dropout • GELU • Global and Sliding Window Attention • Layer Normalization • Linear Layer • Linear Warmup With Linear Decay • Longformer • Multi-Head Attention • Residual Connection • Scaled Dot-Product Attention • Sliding Window Attention • Softmax • Weight Decay • WordPiece
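Several of the methods above (Global and Sliding Window Attention, Dilated Sliding Window Attention) describe the attention pattern that lets Longformer scale past BERT's 512-token limit. The sketch below, assuming the HuggingFace Longformer API, marks the query tokens for global attention so every document token can attend to them; placing global attention on the query is a plausible choice for re-ranking, not necessarily the paper's exact configuration.

```python
# Sketch of Longformer's global-plus-sliding-window attention pattern.
import torch
from transformers import LongformerModel, LongformerTokenizer

tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

query = "example query"
document = "a long candidate document ..."
inputs = tokenizer(
    query, document, truncation=True, max_length=4096, return_tensors="pt"
)

# 0 = sliding-window (local) attention, 1 = global attention.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
# The query occupies the tokens up to the first separator; give them
# (and the CLS token) global attention.
sep_index = (inputs["input_ids"][0] == tokenizer.sep_token_id).nonzero()[0].item()
global_attention_mask[0, : sep_index + 1] = 1

outputs = model(**inputs, global_attention_mask=global_attention_mask)
```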