no code implementations • ICON 2021 • Aloka Fernando, Gihan Dias
The word frequency list and the verified word list are the largest word lists available for the Sinhala language.
1 code implementation • 12 Feb 2024 • Surangika Ranathunga, Nisansa de Silva, Menan Velayuthan, Aloka Fernando, Charitha Rathnayake
We conducted a detailed analysis of the quality of web-mined corpora for two low-resource languages (making three language pairs: English-Sinhala, English-Tamil, and Sinhala-Tamil).
no code implementations • 18 May 2022 • Aloka Fernando, Surangika Ranathunga
However, existing data augmentation (DA) techniques have addressed only one of these out-of-vocabulary (OOV) types and are limited to considering either syntactic constraints or semantic constraints.
no code implementations • 5 Nov 2020 • Aloka Fernando, Surangika Ranathunga, Gihan Dias
This paper focuses on data augmentation techniques in which bilingual lexicon terms are expanded based on case markers to generate new words for use in Statistical Machine Translation (SMT).