Search Results for author: Oddur Kjartansson

Found 9 papers, 2 papers with code

Data Cards: Purposeful and Transparent Dataset Documentation for Responsible AI

1 code implementation • 3 Apr 2022 • Mahima Pushkarna, Andrew Zaldivar, Oddur Kjartansson

In this paper, we propose Data Cards for fostering transparent, purposeful and human-centered documentation of datasets within the practical contexts of industry and research.

Towards Accountability for Machine Learning Datasets: Practices from Software Engineering and Infrastructure

no code implementations • 23 Oct 2020 • Ben Hutchinson, Andrew Smart, Alex Hanna, Emily Denton, Christina Greer, Oddur Kjartansson, Parker Barnes, Margaret Mitchell

In this paper, we introduce a rigorous framework for dataset development transparency which supports decision-making and accountability.

BIG-bench Machine Learning, Decision Making

Google Crowdsourced Speech Corpora and Related Open-Source Resources for Low-Resource Languages and Dialects: An Overview

1 code implementation • 14 Oct 2020 • Alena Butryna, Shan-Hui Cathy Chu, Isin Demirsahin, Alexander Gutkin, Linne Ha, Fei He, Martin Jansche, Cibu Johny, Anna Katanova, Oddur Kjartansson, Chenfang Li, Tatiana Merkulova, Yin May Oo, Knot Pipatsrisawat, Clara Rivera, Supheakmungkol Sarin, Pasindu De Silva, Keshan Sodimana, Richard Sproat, Theeraphol Wattanavekin, Jaka Aris Eko Wibawa

This paper presents an overview of a program designed to address the growing need for developing freely available speech resources for under-represented languages.

Automatic Speech Recognition (ASR)

Open-Source High Quality Speech Datasets for Basque, Catalan and Galician

no code implementations • LREC 2020 • Oddur Kjartansson, Alexander Gutkin, Alena Butryna, Isin Demirsahin, Clara Rivera

This paper introduces new open speech datasets for three of the languages of Spain: Basque, Catalan and Galician.

Automatic Speech Recognition (ASR)

Burmese Speech Corpus, Finite-State Text Normalization and Pronunciation Grammars with an Application to Text-to-Speech

no code implementations • LREC 2020 • Yin May Oo, Theeraphol Wattanavekin, Chenfang Li, Pasindu De Silva, Supheakmungkol Sarin, Knot Pipatsrisawat, Martin Jansche, Oddur Kjartansson, Alexander Gutkin

This paper introduces an open-source, crowd-sourced, multi-speaker speech corpus, along with a comprehensive set of finite-state transducer (FST) grammars for performing text normalization for the Burmese (Myanmar) language.

Open-source Multi-speaker Corpora of the English Accents in the British Isles

no code implementations • LREC 2020 • Isin Demirsahin, Oddur Kjartansson, Alexander Gutkin, Clara Rivera

This paper presents a dataset of transcribed high-quality audio of English sentences recorded by volunteers speaking with different accents of the British Isles.

Open-source Multi-speaker Speech Corpora for Building Gujarati, Kannada, Malayalam, Marathi, Tamil and Telugu Speech Synthesis Systems

no code implementations • LREC 2020 • Fei He, Shan-Hui Cathy Chu, Oddur Kjartansson, Clara Rivera, Anna Katanova, Alexander Gutkin, Isin Demirsahin, Cibu Johny, Martin Jansche, Supheakmungkol Sarin, Knot Pipatsrisawat

We present free, high-quality multi-speaker speech corpora for Gujarati, Kannada, Malayalam, Marathi, Tamil and Telugu, six of the twenty-two official languages of India, which are spoken by 374 million native speakers.

Speech Synthesis

Crowdsourcing Latin American Spanish for Low-Resource Text-to-Speech

no code implementations • LREC 2020 • Adriana Guevara-Rukoz, Isin Demirsahin, Fei He, Shan-Hui Cathy Chu, Supheakmungkol Sarin, Knot Pipatsrisawat, Alexander Gutkin, Alena Butryna, Oddur Kjartansson

In this paper we present a multidialectal corpus approach for building a text-to-speech voice for a new dialect in a language with existing resources, focusing on various South American dialects of Spanish.

Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech

no code implementations • LREC 2018 • Jaka Aris Eko Wibawa, Supheakmungkol Sarin, Chenfang Li, Knot Pipatsrisawat, Keshan Sodimana, Oddur Kjartansson, Alexander Gutkin, Martin Jansche, Linne Ha

Automatic Speech Recognition (ASR), Speech Synthesis
