Multi-EuP: The Multilingual European Parliament Dataset for Analysis of Bias in Information Retrieval

3 Nov 2023 · Jinrui Yang, Timothy Baldwin, Trevor Cohn

We present Multi-EuP, a new multilingual benchmark dataset comprising 22K multilingual documents collected from the European Parliament and spanning 24 languages. The dataset is designed to investigate fairness in multilingual information retrieval (IR), supporting analysis of both language and demographic bias in a ranking context. It offers an authentic multilingual corpus, with topics translated into all 24 languages, as well as cross-lingual relevance judgments. It also provides rich demographic information associated with its documents, facilitating the study of demographic bias. We report the effectiveness of Multi-EuP for benchmarking both monolingual and multilingual IR, and we conduct a preliminary experiment on language bias caused by the choice of tokenization strategy.
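To make the tokenization point concrete, here is a minimal sketch of BM25 scoring over whitespace-split tokens, in the spirit of the `BM25_whitespace_tokenizer` entry listed on this page. It is not the authors' implementation: the class name `BM25`, the helper `whitespace_tokenize`, the lowercasing step, the parameter defaults `k1=1.2` and `b=0.75`, the Lucene-style IDF variant, and the toy documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import List


def whitespace_tokenize(text: str) -> List[str]:
    # Hypothetical tokenizer: lowercase, then split on runs of whitespace.
    # Punctuation stays attached to words ("fisheries," != "fisheries").
    return text.lower().split()


class BM25:
    """Illustrative Okapi BM25 scorer (assumed defaults, not from the paper)."""

    def __init__(self, docs: List[str], k1: float = 1.2, b: float = 0.75):
        self.k1 = k1
        self.b = b
        self.doc_tokens = [whitespace_tokenize(d) for d in docs]
        self.doc_len = [len(tokens) for tokens in self.doc_tokens]
        self.avgdl = sum(self.doc_len) / len(docs) if docs else 0.0
        self.tf = [Counter(tokens) for tokens in self.doc_tokens]
        # Document frequency: number of documents containing each term.
        df = Counter()
        for tokens in self.doc_tokens:
            df.update(set(tokens))
        n = len(docs)
        # Lucene-style IDF, which is always non-negative.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query: str, idx: int) -> float:
        tf = self.tf[idx]
        # Length normalisation; fall back to 1.0 if every document is empty.
        length_ratio = self.doc_len[idx] / self.avgdl if self.avgdl else 1.0
        norm = self.k1 * (1 - self.b + self.b * length_ratio)
        total = 0.0
        for term in whitespace_tokenize(query):
            f = tf.get(term, 0)
            if f:
                total += self.idf[term] * f * (self.k1 + 1) / (f + norm)
        return total

    def rank(self, query: str) -> List[int]:
        # Document indices sorted by descending BM25 score.
        scores = [self.score(query, i) for i in range(len(self.doc_tokens))]
        return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy corpus (invented for this sketch).
docs = [
    "The parliament debated fisheries policy",
    "Budget vote on agriculture, fisheries, and trade",
    "Debate on data protection rules",
]
bm25 = BM25(docs)
ranking = bm25.rank("fisheries policy")
```

In this toy corpus the second document never matches the query term `fisheries`, because whitespace splitting leaves the token as `fisheries,` with the comma attached. Effects of this kind are one way a tokenization choice can shift retrieval scores, and they may differ across languages with different orthographic conventions.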

Code

jrnlp/multi-eup official

Tasks

Benchmarking

Fairness

Information Retrieval

Retrieval

Text Reranking

Datasets

Introduced in the Paper:

Multi-EuP: The Multilingual European Parliament Dataset for Analysis of Bias in Information Retrieval

Results from the Paper

Ranked #1 on Text Reranking on Multi-EuP: The Multilingual European Parliament Dataset for Analysis of Bias in Information Retrieval

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Text Reranking	Multi-EuP: The Multilingual European Parliament Dataset for Analysis of Bias in Information Retrieval	BM25_whitespace_tokenizer	MRR@100	14.18	# 1
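The metric in the table, MRR@100, is the mean reciprocal rank of the first relevant document within the top 100 results, averaged over queries. The sketch below shows one common way to compute it. The function name `mrr_at_k`, the dictionary-based input layout, and the toy data are assumptions for illustration, not the evaluation code used for this leaderboard. The function returns a value between 0 and 1; whether the reported 14.18 is that value multiplied by 100 is not stated on this page.

```python
from typing import Dict, List, Set


def mrr_at_k(
    rankings: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int = 100,
) -> float:
    """Mean reciprocal rank of the first relevant document in the top k.

    rankings maps a query id to its ranked list of document ids (best first).
    relevant maps a query id to the set of relevant document ids.
    Queries with no relevant document in the top k contribute 0.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for qid, ranked_docs in rankings.items():
        rel = relevant.get(qid, set())
        for rank, doc_id in enumerate(ranked_docs[:k], start=1):
            if doc_id in rel:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings)


# Toy data (invented): q1 hits at rank 2, q2 at rank 1, q3 has no hit.
rankings = {"q1": ["d3", "d1", "d2"], "q2": ["d5", "d4"], "q3": ["d9"]}
relevant = {"q1": {"d1"}, "q2": {"d5"}, "q3": {"d7"}}
score = mrr_at_k(rankings, relevant, k=100)  # (1/2 + 1 + 0) / 3 = 0.5
```

Lowering `k` can only reduce or preserve the score, since relevant documents ranked below the cutoff no longer count.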

Methods

No methods listed for this paper.
