Search Results for author: Maud Ehrmann

Found 11 papers, 2 papers with code

Named Entity Recognition and Classification on Historical Documents: A Survey

1 code implementation 23 Sep 2021 Maud Ehrmann, Ahmed Hamdi, Elvys Linhares Pontes, Matteo Romanello, Antoine Doucet

After decades of massive digitisation, an unprecedented amount of historical documents is available in digital format, along with their machine-readable texts.

Classification named-entity-recognition +2

Language Resources for Historical Newspapers: the Impresso Collection

no code implementations LREC 2020 Maud Ehrmann, Matteo Romanello, Simon Clematide, Phillip Benjamin Ströbel, Raphaël Barman

If this represents a huge step forward in terms of preservation and accessibility, the next fundamental challenge, and real promise of digitization, is to exploit the contents of these digital assets, and therefore to adapt and develop appropriate language technologies to search and retrieve information from this 'Big Data of the Past'.

Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers

3 code implementations 14 Feb 2020 Raphaël Barman, Maud Ehrmann, Simon Clematide, Sofia Ares Oliveira, Frédéric Kaplan

The massive amounts of digitized historical documents acquired over the last decades naturally lend themselves to automatic processing and exploration.

Document Layout Analysis, Semantic Segmentation

Media monitoring and information extraction for the highly inflected agglutinative language Hungarian

no code implementations LREC 2014 Júlia Pajzs, Ralf Steinberger, Maud Ehrmann, Mohamed Ebrahim, Leonida della Rocca, Stefano Bucci, Eszter Simon, Tamás Váradi

In this paper, we describe the effort of adding to EMM Hungarian text mining tools for news gathering; document categorisation; named entity recognition and classification for persons, organisations and locations; name lemmatisation; quotation recognition; and cross-lingual linking of related news clusters.

Information Retrieval named-entity-recognition +2

Acronym recognition and processing in 22 languages

no code implementations RANLP 2013 Maud Ehrmann, Leonida della Rocca, Ralf Steinberger, Hristo Tanev

We are presenting work on recognising acronyms of the form Long-Form (Short-Form) such as "International Monetary Fund (IMF)" in millions of news articles in twenty-two languages, as part of our more general effort to recognise entities and their variants in news text and to use them for the automatic analysis of the news, including the linking of related news across languages.
