Search Results for author: Malte Ostendorff

Found 22 papers, 12 papers with code

Claim Extraction and Law Matching for COVID-19-related Legislation

1 code implementation • LREC 2022 • Niklas Dehio, Malte Ostendorff, Georg Rehm

We investigate an automated approach to extract legal claims from news articles and to match the claims with their corresponding applicable laws.

Legal Reasoning

Generating Extended and Multilingual Summaries with Pre-trained Transformers

2 code implementations • LREC 2022 • Rémi Calizzano, Malte Ostendorff, Qian Ruan, Georg Rehm

Almost all summarisation methods and datasets focus on a single language and short summaries.

DFKI SLT at GermEval 2021: Multilingual Pre-training and Data Augmentation for the Classification of Toxicity in Social Media Comments

1 code implementation • GermEval 2021 • Remi Calizzano, Malte Ostendorff, Georg Rehm

Finally, the combination of the two techniques allows us to obtain an F1 score of 0.6899 with XLM-RoBERTa and 0.6859 with MT5.

Data Augmentation

Semantic Relations between Text Segments for Semantic Storytelling: Annotation Tool - Dataset - Evaluation

no code implementations • LREC 2022 • Michael Raring, Malte Ostendorff, Georg Rehm

Essential is the automated processing of text segments extracted from different content resources by identifying the relevance of a text segment to a topic and its semantic relation to other text segments.

Sentence

Fine-grained Classification of Political Bias in German News: A Data Set and Initial Experiments

1 code implementation • ACL (WOAH) 2021 • Dmitrii Aksenov, Peter Bourgonje, Karolina Zaczynska, Malte Ostendorff, Julian Moreno-Schneider, Georg Rehm

We present a data set consisting of German news articles labeled for political bias on a five-point scale in a semi-supervised way.

Bias Detection Binary Classification +1

Investigating Gender Bias in Turkish Language Models

1 code implementation • 17 Apr 2024 • Orhun Caglidil, Malte Ostendorff, Georg Rehm

However, prior research has primarily focused on the English language, especially in the context of gender bias.

Attribute

Tokenizer Choice For LLM Training: Negligible or Crucial?

no code implementations • 12 Oct 2023 • Mehdi Ali, Michael Fromm, Klaudia Thellmann, Richard Rutmann, Max Lübbering, Johannes Leveling, Katrin Klug, Jan Ebert, Niclas Doll, Jasper Schulze Buschhoff, Charvi Jain, Alexander Arno Weber, Lena Jurkschat, Hammam Abdelwahab, Chelsea John, Pedro Ortiz Suarez, Malte Ostendorff, Samuel Weinbach, Rafet Sifa, Stefan Kesselheim, Nicolas Flores-Herr

The recent success of Large Language Models (LLMs) has been predominantly driven by curating the training dataset composition, scaling of model architectures and dataset sizes and advancements in pretraining objectives, leaving tokenizer influence as a blind spot.

AspectCSE: Sentence Embeddings for Aspect-based Semantic Textual Similarity Using Contrastive Learning and Structured Knowledge

no code implementations • 15 Jul 2023 • Tim Schopf, Emanuel Gerber, Malte Ostendorff, Florian Matthes

Generic sentence embeddings provide a coarse-grained approximation of semantic textual similarity but ignore specific aspects that make texts similar.

Contrastive Learning Information Retrieval +5

Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning

no code implementations • 23 Jan 2023 • Malte Ostendorff, Georg Rehm

To address this problem, we introduce a cross-lingual and progressive transfer learning approach, called CLP-Transfer, that transfers models from a source language, for which pretrained models are publicly available, like English, to a new target language.

Cross-Lingual Transfer Language Modelling +1

Specialized Document Embeddings for Aspect-based Similarity of Research Papers

1 code implementation • 28 Mar 2022 • Malte Ostendorff, Till Blume, Terry Ruas, Bela Gipp, Georg Rehm

We compare and analyze three generic document embeddings, six specialized document embeddings and a pairwise classification baseline in the context of research paper recommendations.

Document Classification Recommendation Systems +1

HiStruct+: Improving Extractive Text Summarization with Hierarchical Structure Information

no code implementations • Findings (ACL) 2022 • Qian Ruan, Malte Ostendorff, Georg Rehm

Using various experimental settings on three datasets (i.e., CNN/DailyMail, PubMed and arXiv), our HiStruct+ model outperforms a strong baseline collectively, which differs from our model only in that the hierarchical structure information is not injected.

Ranked #12 on Text Summarization on Pubmed

Extractive Summarization Extractive Text Summarization +2

Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings

1 code implementation • 14 Feb 2022 • Malte Ostendorff, Nils Rethmeier, Isabelle Augenstein, Bela Gipp, Georg Rehm

Learning scientific document representations can be substantially improved through contrastive learning objectives, where the challenge lies in creating positive and negative training samples that encode the desired similarity semantics.

Ranked #1 on Document Classification on SciDocs (MeSH)

Citation Prediction Contrastive Learning +3

A Qualitative Evaluation of User Preference for Link-based vs. Text-based Recommendations of Wikipedia Articles

1 code implementation • 16 Sep 2021 • Malte Ostendorff, Corinna Breitinger, Bela Gipp

We conclude that users of literature recommendation systems can benefit most from hybrid approaches that combine both link- and text-based approaches, where the user's information needs and preferences should control the weighting for the approaches used.

Recommendation Systems

Evaluating Document Representations for Content-based Legal Literature Recommendations

1 code implementation • 28 Apr 2021 • Malte Ostendorff, Elliott Ash, Terry Ruas, Bela Gipp, Julian Moreno-Schneider, Georg Rehm

Simultaneously, legal recommender systems are typically evaluated in small-scale user studies without any publicly available benchmark datasets.

Recommendation Systems Representation Learning +1

Aspect-based Document Similarity for Research Papers

1 code implementation • COLING 2020 • Malte Ostendorff, Terry Ruas, Till Blume, Bela Gipp, Georg Rehm

Our findings motivate future research of aspect-based document similarity and the development of a recommender system based on the evaluated techniques.

Document Classification Recommendation Systems

Contextual Document Similarity for Content-based Literature Recommender Systems

no code implementations • 1 Aug 2020 • Malte Ostendorff

In this doctoral thesis, we explore contextual document similarity measures, i.e., methods that determine document similarity as a triple of two documents and the context of their similarity.

Recommendation Systems

Towards an Open Platform for Legal Information

no code implementations • 27 May 2020 • Malte Ostendorff, Till Blume, Saskia Ostendorff

Recent advances in the area of legal information systems have led to a variety of applications that promise support in processing and accessing legal documents.

QURATOR: Innovative Technologies for Content and Data Curation

no code implementations • 25 Apr 2020 • Georg Rehm, Peter Bourgonje, Stefanie Hegele, Florian Kintzel, Julián Moreno Schneider, Malte Ostendorff, Karolina Zaczynska, Armin Berger, Stefan Grill, Sören Räuchle, Jens Rauenbusch, Lisa Rutenburg, André Schmidt, Mikka Wild, Henry Hoffmann, Julian Fink, Sarah Schulz, Jurica Seva, Joachim Quantz, Joachim Böttger, Josefine Matthey, Rolf Fricke, Jan Thomsen, Adrian Paschke, Jamal Al Qundus, Thomas Hoppe, Naouel Karam, Frauke Weichhardt, Christian Fillies, Clemens Neudecker, Mike Gerber, Kai Labusch, Vahid Rezanezhad, Robin Schaefer, David Zellhöfer, Daniel Siewert, Patrick Bunk, Lydia Pintscher, Elena Aleynikova, Franziska Heine

In all domains and sectors, the demand for intelligent systems to support the processing and generation of digital content is rapidly increasing.

Towards Discourse Parsing-inspired Semantic Storytelling

no code implementations • 25 Apr 2020 • Georg Rehm, Karolina Zaczynska, Julián Moreno-Schneider, Malte Ostendorff, Peter Bourgonje, Maria Berger, Jens Rauenbusch, André Schmidt, Mikka Wild

Previous work of ours on Semantic Storytelling uses text analytics procedures including Named Entity Recognition and Event Detection.

coreference-resolution Discourse Parsing +5

Named Entities in Medical Case Reports: Corpus and Experiments

no code implementations • LREC 2020 • Sarah Schulz, Jurica Ševa, Samuel Rodriguez, Malte Ostendorff, Georg Rehm

We present a new corpus comprising annotations of medical entities in case reports, originating from PubMed Central's open access library.

named-entity-recognition Named Entity Recognition +4

Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles

4 code implementations • 22 Mar 2020 • Malte Ostendorff, Terry Ruas, Moritz Schubotz, Georg Rehm, Bela Gipp

In this paper, we model the problem of finding the relationship between two documents as a pairwise document classification task.

Document Classification General Classification +1

Enriching BERT with Knowledge Graph Embeddings for Document Classification

1 code implementation • KONVENS / GermEval 2019 • Malte Ostendorff, Peter Bourgonje, Maria Berger, Julian Moreno-Schneider, Georg Rehm, Bela Gipp

In this paper, we focus on the classification of books using short descriptive texts (cover blurbs) and additional metadata.

Classification Descriptive +4
