M2KR (Multi-task Multi-modal Knowledge Retrieval)

Introduced by Lin et al. in PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers

M2KR is a collection of datasets designed for training and evaluating general-purpose vision-language retrievers. The datasets are released in Hugging Face Dataset format and cover three retrieval tasks:

Image to Text (I2T) retrieval: This task involves retrieving relevant textual descriptions given an input image.
Question to Text (Q2T) retrieval: Here, the goal is to retrieve relevant text passages based on a given question.
Image & Question to Text (IQ2T) retrieval: This task combines both image and question inputs to retrieve relevant textual information.
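
The three tasks differ only in which inputs the query carries, so a retriever can infer the task from the query itself. The following is a minimal sketch of that idea, not M2KR's actual schema: the Query class, its field names (image_path, question), and the task_type helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Query:
    """Hypothetical query record; field names are illustrative, not M2KR's schema."""
    image_path: Optional[str] = None  # path or ID of the query image, if any
    question: Optional[str] = None    # question text, if any


def task_type(query: Query) -> str:
    """Return the retrieval task implied by which inputs the query carries."""
    has_image = query.image_path is not None
    # Treat an empty or whitespace-only question as absent.
    has_question = bool(query.question and query.question.strip())
    if has_image and has_question:
        return "IQ2T"  # image and question together
    if has_image:
        return "I2T"   # image only
    if has_question:
        return "Q2T"   # question only
    raise ValueError("a query needs an image, a question, or both")


# Made-up example queries, one per task type.
examples = [
    Query(image_path="img_001.jpg"),
    Query(question="Who designed the Eiffel Tower?"),
    Query(image_path="img_002.jpg", question="What is this building?"),
]
labels = [task_type(q) for q in examples]
print(labels)  # ['I2T', 'Q2T', 'IQ2T']
```

In all three cases the retrieval target is text, which is why a single retriever can be trained on the mixture of tasks.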

The M2KR benchmark comprises nine datasets, each assigned to one of these tasks. Datasets in the benchmark include:

WIT (Wikipedia-based Image Text): Used for I2T retrieval.
IGLUE (Image-Grounded Language Understanding Evaluation): Used for I2T retrieval.
KVQA (Knowledge-aware Visual Question Answering): Used for I2T retrieval.
CC3M (Conceptual Captions 3M): Used for I2T retrieval.
OVEN (Open-domain Visual Entity Recognition): Used for IQ2T retrieval.
LLaVA (Large Language and Vision Assistant): Used for IQ2T retrieval.
OKVQA (Outside Knowledge Visual Question Answering): Used for IQ2T retrieval.
Infoseek: Used for IQ2T retrieval.
E-VQA (Encyclopedic VQA): Used for IQ2T retrieval.

Together, these datasets let researchers train and evaluate a single vision-language retriever across all three task types¹².

(1) M2KR Benchmark Datasets, GitHub. https://github.com/LinWeizheDragon/FLMR/blob/main/docs/Datasets.md
(2) arXiv:2402.08327v1 [cs.CL], 13 Feb 2024. https://arxiv.org/pdf/2402.08327.pdf
(3) Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers. https://preflmr.github.io/
