Ad-hoc Document Retrieval using Weak-Supervision with BERT and GPT2

EMNLP 2020  ·  Yosi Mass, Haggai Roitman ·

We describe a weakly-supervised method for training deep learning models for the task of ad-hoc document retrieval. Our method is based on generative and discriminative models that are trained using weak-supervision just from the documents in the corpus. We present an end-to-end retrieval system that starts with traditional information retrieval methods, followed by two deep learning re-rankers. We evaluate our method on three different datasets: a COVID-19 related scientific literature dataset and two news datasets. We show that our method outperforms state-of-the-art methods; this without the need for the expensive process of manually labeling data.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here