1 code implementation • 18 Apr 2024 • Kelvin Wey Han Chan, Christopher Bryant, Li Nguyen, Andrew Caines, Zheng Yuan
Through this exploration, we propose a novel method of generating synthetic CSW GEC datasets by translating different spans of text within existing GEC corpora.
no code implementations • LREC 2020 • Li Nguyen, Christopher Bryant
This paper introduces the Canberra Vietnamese-English Code-switching corpus (CanVEC), an original corpus of natural mixed speech that we semi-automatically annotated with language information, part of speech (POS) tags and Vietnamese translations.