Handwritten Text Recognition from Crowdsourced Annotations
In this paper, we explore different ways of training a model for handwritten text recognition when multiple imperfect or noisy transcriptions are available. We consider various training configurations, such as selecting a single transcription, retaining all transcriptions, or computing an aggregated transcription from all available annotations. In addition, we evaluate the impact of quality-based data selection, where samples with low inter-annotator agreement are removed from the training set. Our experiments are carried out on municipal registers of the city of Belfort (France) written between 1790 and 1946. The results show that computing a consensus transcription or training on all available transcriptions are both effective alternatives to selecting a single transcription. However, selecting training samples based on the degree of agreement between annotators introduces a bias in the training data and does not improve the results. Our dataset is publicly available on Zenodo: https://zenodo.org/record/8041668.
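The agreement-based selection mentioned above relies on quantifying how much annotators disagree on the same line. A minimal sketch of one way to do this, using mean pairwise character error rate (CER) between transcriptions as the agreement measure; the sample structure and threshold are illustrative assumptions, not the paper's code:

```python
from itertools import combinations

def levenshtein(a, b):
    """Edit distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def cer(reference, hypothesis):
    """Character error rate of hypothesis against reference."""
    return levenshtein(reference, hypothesis) / max(len(reference), 1)

def agreement(transcriptions):
    """Mean pairwise CER between annotators; lower means higher agreement."""
    pairs = list(combinations(transcriptions, 2))
    if not pairs:
        return 0.0
    return sum(cer(a, b) for a, b in pairs) / len(pairs)

# Keep only samples whose annotators agree well (threshold is illustrative).
samples = [
    {"image": "line_001.png", "transcriptions": ["rue de la gare", "rue de la garre"]},
    {"image": "line_002.png", "transcriptions": ["le 12 mai", "le 17 mars"]},
]
THRESHOLD = 0.10
kept = [s for s in samples if agreement(s["transcriptions"]) <= THRESHOLD]
print([s["image"] for s in kept])  # -> ['line_001.png']
```

As the paper notes, filtering this way changes the composition of the training set, which is precisely the bias the experiments evaluate.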
| Task | Dataset | Model | CER (%) | WER (%) | Global Rank |
|---|---|---|---|---|---|
| Handwritten Text Recognition | Belfort | PyLaia (human transcriptions + random split) | 10.54 | 28.11 | #4 |
| Handwritten Text Recognition | Belfort | PyLaia (human transcriptions + agreement-based split) | 5.57 | 19.12 | #3 |
| Handwritten Text Recognition | Belfort | PyLaia (all transcriptions + agreement-based split) | 4.34 | 15.14 | #1 |
| Handwritten Text Recognition | Belfort | PyLaia (ROVER consensus + agreement-based split) | 4.95 | 17.08 | #2 |
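The CER and WER values in the table follow the standard Levenshtein-based definitions (edit distance normalised by reference length, at character and word level respectively). A minimal self-contained sketch of those definitions, not the paper's evaluation code:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def error_rate(ref, hyp):
    """Edit distance normalised by reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)

reference = "rue de la gare"
prediction = "rue de le garre"
cer = error_rate(list(reference), list(prediction))      # character level
wer = error_rate(reference.split(), prediction.split())  # word level
print(f"CER: {cer:.2%}  WER: {wer:.2%}")
```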