TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Zero-Shot Cross-Lingual Transfer	XTREME	T-ULRv2 + StableTune	Sentence-pair Classification	88.8	# 13
Zero-Shot Cross-Lingual Transfer	XTREME	T-ULRv2 + StableTune	Structured Prediction	75.4	# 15
Zero-Shot Cross-Lingual Transfer	XTREME	T-ULRv2 + StableTune	Question Answering	72.9	# 12
Zero-Shot Cross-Lingual Transfer	XTREME	T-ULRv2 + StableTune	Sentence Retrieval	89.3	# 16
Zero-Shot Cross-Lingual Transfer	XTREME	T-ULRv2 + StableTune	Avg	80.7	# 16

InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training

NAACL 2021 · Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, He-Yan Huang, Ming Zhou

In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts. The unified view helps us to better understand the existing methods for learning cross-lingual representations. More importantly, inspired by the framework, we propose a new pre-training task based on contrastive learning. Specifically, we regard a bilingual sentence pair as two views of the same meaning and encourage their encoded representations to be more similar than the negative examples. By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models. Experimental results on several benchmarks show that our approach achieves considerably better performance. The code and pre-trained models are available at https://aka.ms/infoxlm.
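
The contrastive idea in the abstract (a translation pair should be more similar than the negative examples) can be illustrated with a generic InfoNCE-style loss. This is a minimal sketch, not the authors' implementation: the function name `info_nce_loss`, the cosine normalization, the temperature value, and the use of the other sentences in the batch as negatives are all assumptions made for illustration. The abstract does not say how negatives are drawn.

```python
import numpy as np


def info_nce_loss(src, tgt, temperature=0.1):
    """InfoNCE-style loss over a batch of aligned sentence embeddings.

    src[i] and tgt[i] are assumed to encode the two sides of one
    translation pair. Every tgt[j] with j != i acts as a negative for
    src[i] (in-batch negatives, an assumption of this sketch).
    """
    # L2-normalize so the dot product is cosine similarity.
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)

    # Pairwise similarity matrix, scaled by the temperature.
    logits = src @ tgt.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The correct "class" for row i is column i (its own translation).
    return float(-np.mean(np.diag(log_probs)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    english = rng.normal(size=(8, 16))                    # hypothetical encoder outputs
    french = english + 0.05 * rng.normal(size=(8, 16))    # near-identical "translations"
    shuffled = french[::-1]                               # deliberately misaligned pairs

    print("aligned pairs:   ", info_nce_loss(english, french))
    print("misaligned pairs:", info_nce_loss(english, shuffled))
```

Minimizing this loss pulls each pair together and pushes it away from the other sentences in the batch. Under the usual InfoNCE analysis, the negated loss plus the log of the number of candidates is a lower bound on the mutual information between the two views, which is the link to the framework described above.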

Code

microsoft/unilm (18,431 stars)

CZWin32768/xnlg (128 stars)

facebookresearch/data2vec_vision

jiamingkong/infoxlm_paddle

Tasks

Contrastive Learning

Cross-Lingual Transfer

Language Modelling

Sentence

Datasets

XNLI

MLQA

CC100

XTREME

Results from the Paper

Ranked #16 on Zero-Shot Cross-Lingual Transfer on XTREME

Methods

Absolute Position Encodings • Adam • BPE • Contrastive Multiview Coding • Dense Connections • Dropout • InfoNCE • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
