1 code implementation • 29 Apr 2024 • Pat Verga, Sebastian Hofstätter, Sophia Althammer, Yixuan Su, Aleksandra Piktus, Arkady Arkhangorodsky, Minjie Xu, Naomi White, Patrick Lewis
As Large Language Models (LLMs) have become more advanced, they have outpaced our abilities to accurately evaluate their quality.
no code implementations • 12 Sep 2023 • Sophia Althammer, Guido Zuccon, Sebastian Hofstätter, Suzan Verberne, Allan Hanbury
We further find that gains provided by AL strategies come at the expense of more assessments (thus higher annotation costs), and that AL strategies underperform random selection when comparing effectiveness given a fixed annotation cost.
1 code implementation • 24 May 2023 • Mete Sertkan, Sophia Althammer, Sebastian Hofstätter
In this paper, we introduce Ranger - a toolkit to facilitate the easy use of effect-size-based meta-analysis for multi-task evaluation in NLP and IR.
1 code implementation • 14 Aug 2022 • Sophia Althammer, Sebastian Hofstätter, Suzan Verberne, Allan Hanbury
Robust test collections are crucial for Information Retrieval research.
no code implementations • 24 Mar 2022 • Sebastian Hofstätter, Omar Khattab, Sophia Althammer, Mete Sertkan, Allan Hanbury
Recent progress in neural information retrieval has demonstrated large gains in effectiveness, while often sacrificing the efficiency and interpretability of the neural model compared to classical approaches.
1 code implementation • 5 Jan 2022 • Sophia Althammer, Sebastian Hofstätter, Mete Sertkan, Suzan Verberne, Allan Hanbury
However, in the web domain we are in a setting with large amounts of training data and a query-to-passage or query-to-document retrieval task.
2 code implementations • 2 Jan 2022 • Sebastian Hofstätter, Sophia Althammer, Mete Sertkan, Allan Hanbury
We present strong Transformer-based re-ranking and dense retrieval baselines for the recently released TripClick health ad-hoc retrieval collection.
1 code implementation • 11 Oct 2021 • Sebastian Hofstätter, Sophia Althammer, Mete Sertkan, Allan Hanbury
We describe our workflow to create an engaging remote learning experience for a university course, while minimizing the post-production time of the educators.
no code implementations • WNUT (ACL) 2021 • Malte Feucht, Zhiliang Wu, Sophia Althammer, Volker Tresp
ICD-9 coding is a relevant clinical billing task, where unstructured texts with information about a patient's diagnosis and treatments are annotated with multiple ICD-9 codes.
1 code implementation • 9 Aug 2021 • Sophia Althammer, Arian Askari, Suzan Verberne, Allan Hanbury
We address this challenge by combining lexical and dense retrieval methods at the paragraph level of the cases for first-stage retrieval.
1 code implementation • 10 Jun 2021 • Sophia Althammer, Mark Buckley, Sebastian Hofstätter, Allan Hanbury
Domain-specific contextualized language models have demonstrated substantial effectiveness gains for domain-specific downstream tasks, like similarity matching, entity recognition or information retrieval.
1 code implementation • 18 Jan 2021 • Sebastian Hofstätter, Aldo Lipani, Sophia Althammer, Markus Zlabinger, Allan Hanbury
In this work we analyze position bias on datasets, the contextualized representations, and their effect on retrieval results.
1 code implementation • 21 Dec 2020 • Sophia Althammer, Sebastian Hofstätter, Allan Hanbury
For reproducibility and transparency, and to benefit the community, we make our source code and trained models publicly available.
1 code implementation • 6 Oct 2020 • Sebastian Hofstätter, Sophia Althammer, Michael Schröder, Mete Sertkan, Allan Hanbury
Based on this finding, we propose a cross-architecture training procedure with a margin-focused loss (Margin-MSE) that adapts knowledge distillation to the varying score output distributions of different BERT and non-BERT passage ranking architectures.
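The idea of a margin-focused distillation loss can be illustrated with a minimal sketch. It assumes the loss is the mean squared error between the student's and the teacher's score margins (positive passage score minus negative passage score), so the student only has to match score differences rather than the teacher's absolute score range. The function name `margin_mse` and its parameters are illustrative and not taken from the authors' code.

```python
def margin_mse(student_pos, student_neg, teacher_pos, teacher_neg):
    """Mean squared error between student and teacher score margins.

    Each argument is a list of floats with one entry per training triple
    (query, positive passage, negative passage). Hypothetical helper for
    illustration only.
    """
    n = len(student_pos)
    if n == 0:
        raise ValueError("at least one training triple is required")
    if not (len(student_neg) == len(teacher_pos) == len(teacher_neg) == n):
        raise ValueError("all score lists must have the same length")

    total = 0.0
    for s_pos, s_neg, t_pos, t_neg in zip(
        student_pos, student_neg, teacher_pos, teacher_neg
    ):
        student_margin = s_pos - s_neg  # how far the student separates the pair
        teacher_margin = t_pos - t_neg  # how far the teacher separates the pair
        total += (student_margin - teacher_margin) ** 2
    return total / n


# Scores on different scales but with identical margins give zero loss.
loss_same_margin = margin_mse([2.0], [1.0], [10.0], [9.0])
loss_mixed = margin_mse([3.0, 1.0], [1.0, 1.0], [5.0, 4.0], [4.0, 2.0])
```

Because only the margins enter the loss, a student whose scores live on a different scale than the teacher's is not penalized as long as it separates positive and negative passages by the same amount.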