1 code implementation • EMNLP 2020 • Peerat Limkonchotiwat, Wannaphong Phatthiyaphaibun, Raheem Sarwar, Ekapol Chuangsuwanich, Sarana Nutanong
Like many Natural Language Processing tasks, Thai word segmentation is domain-dependent.
Ranked #1 on Thai Word Segmentation on WS160 (using extra training data)
no code implementations • Findings (EMNLP) 2021 • Nattapol Trijakwanich, Peerat Limkonchotiwat, Raheem Sarwar, Wannaphong Phatthiyaphaibun, Ekapol Chuangsuwanich, Sarana Nutanong
Cross-lingual Sentence Retrieval (CLSR) aims at retrieving parallel sentence pairs that are translations of each other from a multilingual set of comparable documents.
1 code implementation • Findings (ACL) 2022 • Weerayut Buaphet, Can Udomcharoenchaikit, Peerat Limkonchotiwat, Attapol Rutherford, Sarana Nutanong
Our work, to the best of our knowledge, presents the largest non-English N-NER dataset and the first non-English one with fine-grained classes.
1 code implementation • Findings (NAACL) 2022 • Peerat Limkonchotiwat, Wuttikorn Ponwitayarat, Can Udomcharoenchaikit, Ekapol Chuangsuwanich, Sarana Nutanong
A common approach to CL-ReQA is to create a multilingual sentence embedding space such that question-answer pairs across different languages are close to each other.
1 code implementation • 24 Mar 2024 • Wannaphong Phatthiyaphaibun, Surapon Nonesung, Patomporn Payoungkhamdee, Peerat Limkonchotiwat, Can Udomcharoenchaikit, Jitkapat Sawatphol, Chompakorn Chaksangchaichot, Ekapol Chuangsuwanich, Sarana Nutanong
Our model is based on SEA-LION and a collection of instruction following datasets.
1 code implementation • 7 Dec 2023 • Wannaphong Phatthiyaphaibun, Korakot Chaovavanich, Charin Polpanumas, Arthit Suriyawongkul, Lalita Lowphansirikul, Pattarawat Chormai, Peerat Limkonchotiwat, Thanathip Suntorntip, Can Udomcharoenchaikit
It provides a wide range of software, models, and datasets for Thai language.
1 code implementation • 6 Nov 2023 • Peerat Limkonchotiwat, Wuttikorn Ponwitayarat, Lalita Lowphansirikul, Can Udomcharoenchaikit, Ekapol Chuangsuwanich, Sarana Nutanong
In this paper, we propose a framework called Self-supervised Cross-View Training (SCT) to narrow the performance gap between large and small PLMs.
1 code implementation • 17 Jun 2023 • Panuthep Tasawong, Wuttikorn Ponwitayarat, Peerat Limkonchotiwat, Can Udomcharoenchaikit, Ekapol Chuangsuwanich, Sarana Nutanong
One of the main challenges of dense retrieval in real-world settings is the handling of queries containing misspelled words.
1 code implementation • 9 Aug 2022 • Wannaphong Phatthiyaphaibun, Chompakorn Chaksangchaichot, Peerat Limkonchotiwat, Ekapol Chuangsuwanich, Sarana Nutanong
However, most of these ASR models are available in English; only a minority of the models are available in Thai.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1