1 code implementation • LREC 2022 • Niklas Dehio, Malte Ostendorff, Georg Rehm
We investigate an automated approach to extract legal claims from news articles and to match the claims with their corresponding applicable laws.
2 code implementations • LREC 2022 • Rémi Calizzano, Malte Ostendorff, Qian Ruan, Georg Rehm
Almost all summarisation methods and datasets focus on a single language and short summaries.
1 code implementation • GermEval 2021 • Remi Calizzano, Malte Ostendorff, Georg Rehm
Finally, the combination of the two techniques allows us to obtain an F1 score of 0.6899 with XLM-RoBERTa and 0.6859 with MT5.
no code implementations • LREC 2022 • Michael Raring, Malte Ostendorff, Georg Rehm
Essential is the automated processing of text segments extracted from different content resources by identifying the relevance of a text segment to a topic and its semantic relation to other text segments.
1 code implementation • ACL (WOAH) 2021 • Dmitrii Aksenov, Peter Bourgonje, Karolina Zaczynska, Malte Ostendorff, Julian Moreno-Schneider, Georg Rehm
We present a data set consisting of German news articles labeled for political bias on a five-point scale in a semi-supervised way.
1 code implementation • 17 Apr 2024 • Orhun Caglidil, Malte Ostendorff, Georg Rehm
However, prior research has primarily focused on the English language, especially in the context of gender bias.
no code implementations • 12 Oct 2023 • Mehdi Ali, Michael Fromm, Klaudia Thellmann, Richard Rutmann, Max Lübbering, Johannes Leveling, Katrin Klug, Jan Ebert, Niclas Doll, Jasper Schulze Buschhoff, Charvi Jain, Alexander Arno Weber, Lena Jurkschat, Hammam Abdelwahab, Chelsea John, Pedro Ortiz Suarez, Malte Ostendorff, Samuel Weinbach, Rafet Sifa, Stefan Kesselheim, Nicolas Flores-Herr
The recent success of Large Language Models (LLMs) has been predominantly driven by curating the training dataset composition, scaling of model architectures and dataset sizes and advancements in pretraining objectives, leaving tokenizer influence as a blind spot.
no code implementations • 15 Jul 2023 • Tim Schopf, Emanuel Gerber, Malte Ostendorff, Florian Matthes
Generic sentence embeddings provide a coarse-grained approximation of semantic textual similarity but ignore specific aspects that make texts similar.
no code implementations • 23 Jan 2023 • Malte Ostendorff, Georg Rehm
To address this problem, we introduce a cross-lingual and progressive transfer learning approach, called CLP-Transfer, that transfers models from a source language, for which pretrained models are publicly available, like English, to a new target language.
1 code implementation • 28 Mar 2022 • Malte Ostendorff, Till Blume, Terry Ruas, Bela Gipp, Georg Rehm
We compare and analyze three generic document embeddings, six specialized document embeddings and a pairwise classification baseline in the context of research paper recommendations.
no code implementations • Findings (ACL) 2022 • Qian Ruan, Malte Ostendorff, Georg Rehm
Using various experimental settings on three datasets (i.e., CNN/DailyMail, PubMed and arXiv), our HiStruct+ model outperforms a strong baseline collectively, which differs from our model only in that the hierarchical structure information is not injected.
Ranked #12 on Text Summarization on Pubmed
1 code implementation • 14 Feb 2022 • Malte Ostendorff, Nils Rethmeier, Isabelle Augenstein, Bela Gipp, Georg Rehm
Learning scientific document representations can be substantially improved through contrastive learning objectives, where the challenge lies in creating positive and negative training samples that encode the desired similarity semantics.
Ranked #1 on Document Classification on SciDocs (MeSH)
1 code implementation • 16 Sep 2021 • Malte Ostendorff, Corinna Breitinger, Bela Gipp
We conclude that users of literature recommendation systems can benefit most from hybrid approaches that combine both link- and text-based approaches, where the user's information needs and preferences should control the weighting for the approaches used.
1 code implementation • 28 Apr 2021 • Malte Ostendorff, Elliott Ash, Terry Ruas, Bela Gipp, Julian Moreno-Schneider, Georg Rehm
Simultaneously, legal recommender systems are typically evaluated in small-scale user studies without any publicly available benchmark datasets.
1 code implementation • COLING 2020 • Malte Ostendorff, Terry Ruas, Till Blume, Bela Gipp, Georg Rehm
Our findings motivate future research on aspect-based document similarity and the development of a recommender system based on the evaluated techniques.
no code implementations • 1 Aug 2020 • Malte Ostendorff
In this doctoral thesis, we explore contextual document similarity measures, i.e., methods that determine document similarity as a triple of two documents and the context of their similarity.
no code implementations • 27 May 2020 • Malte Ostendorff, Till Blume, Saskia Ostendorff
Recent advances in the area of legal information systems have led to a variety of applications that promise support in processing and accessing legal documents.
no code implementations • 25 Apr 2020 • Georg Rehm, Peter Bourgonje, Stefanie Hegele, Florian Kintzel, Julián Moreno Schneider, Malte Ostendorff, Karolina Zaczynska, Armin Berger, Stefan Grill, Sören Räuchle, Jens Rauenbusch, Lisa Rutenburg, André Schmidt, Mikka Wild, Henry Hoffmann, Julian Fink, Sarah Schulz, Jurica Seva, Joachim Quantz, Joachim Böttger, Josefine Matthey, Rolf Fricke, Jan Thomsen, Adrian Paschke, Jamal Al Qundus, Thomas Hoppe, Naouel Karam, Frauke Weichhardt, Christian Fillies, Clemens Neudecker, Mike Gerber, Kai Labusch, Vahid Rezanezhad, Robin Schaefer, David Zellhöfer, Daniel Siewert, Patrick Bunk, Lydia Pintscher, Elena Aleynikova, Franziska Heine
In all domains and sectors, the demand for intelligent systems to support the processing and generation of digital content is rapidly increasing.
no code implementations • 25 Apr 2020 • Georg Rehm, Karolina Zaczynska, Julián Moreno-Schneider, Malte Ostendorff, Peter Bourgonje, Maria Berger, Jens Rauenbusch, André Schmidt, Mikka Wild
Previous work of ours on Semantic Storytelling uses text analytics procedures including Named Entity Recognition and Event Detection.
no code implementations • LREC 2020 • Sarah Schulz, Jurica Ševa, Samuel Rodriguez, Malte Ostendorff, Georg Rehm
We present a new corpus comprising annotations of medical entities in case reports, originating from PubMed Central's open access library.
4 code implementations • 22 Mar 2020 • Malte Ostendorff, Terry Ruas, Moritz Schubotz, Georg Rehm, Bela Gipp
In this paper, we model the problem of finding the relationship between two documents as a pairwise document classification task.
1 code implementation • KONVENS / GermEval 2019 • Malte Ostendorff, Peter Bourgonje, Maria Berger, Julian Moreno-Schneider, Georg Rehm, Bela Gipp
In this paper, we focus on the classification of books using short descriptive texts (cover blurbs) and additional metadata.