no code implementations • COLING 2022 • Tin Van Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In this paper, we introduce ViNLI (Vietnamese Natural Language Inference), an open-domain and high-quality corpus for evaluating Vietnamese NLI models, which is created and evaluated with a strict process of quality control.
1 code implementation • 29 Apr 2024 • Huy Quang Pham, Thang Kien-Bao Nguyen, Quan Van Nguyen, Dan Quang Tran, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
To this end, we introduce a novel dataset, ViOCRVQA (Vietnamese Optical Character Recognition - Visual Question Answering dataset), consisting of 28,000+ images and 120,000+ question-answer pairs.
Optical Character Recognition Optical Character Recognition (OCR) +2
1 code implementation • 16 Apr 2024 • Quan Van Nguyen, Dan Quang Tran, Huy Quang Pham, Thang Kien-Bao Nguyen, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Visual Question Answering (VQA) is a complicated task that requires the capability of simultaneously processing natural language and images.
Multimodal Deep Learning Optical Character Recognition (OCR) +5
no code implementations • 23 Mar 2024 • Phong Nguyen-Thuan Do, Son Quoc Tran, Phu Gia Hoang, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
The success of Natural Language Understanding (NLU) benchmarks in various languages, such as GLUE for English, CLUE for Chinese, KLUE for Korean, and IndoNLU for Indonesian, has facilitated the evaluation of new NLU models across a wide range of tasks.
1 code implementation • 5 Feb 2024 • Thinh Phuoc Ngo, Khoa Tran Anh Dang, Son T. Luu, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
This paper presents the development process of a Vietnamese spoken language corpus for machine reading comprehension (MRC) tasks and provides insights into the challenges and opportunities associated with using real-world data for machine reading comprehension tasks.
no code implementations • 14 Dec 2023 • Dang Van Thin, Duong Ngoc Hao, Ngan Luu-Thuy Nguyen
The ComOM shared task aims to extract comparative opinions from product reviews in the Vietnamese language.
no code implementations • 13 Dec 2023 • Nhu-Thanh Nguyen, Khoa Thi-Kim Phan, Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen
Abuse in its various forms, including physical, psychological, verbal, sexual, financial, and cultural, has a negative impact on mental health.
1 code implementation • 7 May 2023 • Nghia Hieu Nguyen, Duong T. D. Vo, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
The VQA task requires methods that have the ability to fuse the information from questions and images to produce appropriate answers.
1 code implementation • 31 Mar 2023 • Son T. Luu, Khoi Trong Hoang, Tuong Quang Pham, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
From the results of the error analysis, we found that the main challenge for reading comprehension models is understanding the implicit context in texts and linking the pieces together in order to find the correct answers.
no code implementations • 16 Mar 2023 • Son Quoc Tran, Phong Nguyen-Thuan Do, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
From the analysis results, we suggest new directions for developing Vietnamese language models.
no code implementations • 23 Feb 2023 • Ngan Luu-Thuy Nguyen, Nghia Hieu Nguyen, Duong T. D Vo, Khanh Quoc Tran, Kiet Van Nguyen
Visual Question Answering (VQA) is a challenging task of natural language processing (NLP) and computer vision (CV), attracting significant attention from researchers.
1 code implementation • 24 Jan 2023 • Phu Gia Hoang, Canh Duc Luu, Khanh Quoc Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
The rise in hateful and offensive language directed at other users is one of the adverse side effects of the increased use of social networking platforms.
Ranked #1 on Sequence-to-sequence Language Modeling on ViHOS
no code implementations • 1 Jan 2023 • Hang Thi-Thu Le, Viet-Duc Ho, Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen
The classification of answerability questions is a relatively significant sub-task in machine reading comprehension; however, there haven't been many studies.
Machine Reading Comprehension Vietnamese Machine Reading Comprehension +1
no code implementations • 1 Jan 2023 • Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen
To the best of our knowledge, this paper made the first attempt to answer whether word segmentation is necessary for Vietnamese sentiment classification.
no code implementations • 1 Jan 2023 • Quoc-Loc Duong, Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen
The experimental results give conclusions about the influence and role of semantic representation on Vietnamese in understanding natural language.
Natural Language Inference Natural Language Understanding +2
1 code implementation • 15 Nov 2022 • Khiem Vinh Tran, Hao Phu Phan, Khang Nguyen Duc Quach, Ngan Luu-Thuy Nguyen, Jun Jo, Thanh Tam Nguyen
In that, we study various question types, properties, languages, and domains to provide insights on where existing systems struggle.
no code implementations • 21 Sep 2022 • Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Inspired by the success of GLUE, we introduce the Social Media Text Classification Evaluation (SMTCE) benchmark, a collection of datasets and models across a diverse set of SMTC tasks.
no code implementations • 20 Jun 2022 • Nhung Thi-Hong Nguyen, Phuong Phan-Dieu Ha, Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Question answering (QA) systems have gained explosive attention in recent years.
no code implementations • 14 Apr 2022 • Kiet Van Nguyen, Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
Question answering (QA) is a natural language understanding task within the fields of information retrieval and information extraction that has attracted much attention from the computational linguistics and artificial intelligence research community in recent years because of the strong development of machine reading comprehension-based models.
no code implementations • 22 Mar 2022 • Kiet Van Nguyen, Son Quoc Tran, Luan Thanh Nguyen, Tin Van Huynh, Son T. Luu, Ngan Luu-Thuy Nguyen
To address the weakness, we provide the research community with a benchmark dataset named UIT-ViQuAD 2.0 for evaluating the MRC task and question answering systems for the Vietnamese language.
no code implementations • PACLIC 2021 • Duc-Vu Nguyen, Linh-Bao Vo, Ngoc-Linh Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Previous studies on joint Chinese word segmentation and part-of-speech tagging mainly follow the character-based tagging model focusing on modeling n-gram features.
no code implementations • 1 Oct 2021 • Duc-Vu Nguyen, Linh-Bao Vo, Dang Van Thin, Ngan Luu-Thuy Nguyen
In this paper, we propose a span labeling approach to model n-gram information for Vietnamese word segmentation, namely SpanSeg.
no code implementations • 31 Aug 2021 • Huy Quoc To, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen
Recent research has demonstrated that BERT shows potential in a wide range of natural language processing tasks.
no code implementations • 19 May 2021 • Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Kiet Van Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
We propose a conversion algorithm to create the dataset for sentence extraction-based machine reading comprehension and three types of approaches for sentence extraction-based machine reading comprehension in Vietnamese.
1 code implementation • 4 May 2021 • Son T. Luu, Mao Nguyen Bui, Loi Duc Nguyen, Khiem Vinh Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
To help machines understand conversation texts, we present UIT-ViCoQA, a new corpus for conversational machine reading comprehension in the Vietnamese language.
no code implementations • 24 Apr 2021 • Nhung Thi-Hong Nguyen, Phuong Phan-Dieu Ha, Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Customer product reviews play a role in improving the quality of products and services for business organizations or their brands.
Complaint Comment Classification Constructive Comment Classification +2
1 code implementation • SEMEVAL 2021 • Son T. Luu, Ngan Luu-Thuy Nguyen
We present our work on SemEval-2021 Task 5, Toxic Spans Detection.
2 code implementations • 22 Mar 2021 • Son T. Luu, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
On social media, hate speech has become a critical problem for social network users.
Hate Speech Detection Vietnamese Social Media Text Processing
no code implementations • 18 Mar 2021 • Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
For these tasks, we propose a system for constructive and toxic speech detection using PhoBERT, the state-of-the-art transfer learning model in Vietnamese NLP.
Constructive Comment Classification General Classification +2
no code implementations • 17 Mar 2021 • Dang Van Thin, Lac Si Le, Vu Xuan Hoang, Ngan Luu-Thuy Nguyen
In this paper, we investigate the performance of various monolingual pre-trained language models compared with multilingual models on the Vietnamese aspect category detection problem.
Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1
1 code implementation • 24 Feb 2021 • Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In this paper, we implement this idea to improve word segmentation and part-of-speech tagging for the Vietnamese language by employing a simplified constituency parser.
no code implementations • 21 Oct 2020 • Huy Quoc To, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen
We propose a new dataset for gender prediction based on Vietnamese names.
no code implementations • 19 Oct 2020 • Tuan-Vi Tran, Xuan-Thien Pham, Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In this work, we use a span-based approach for Vietnamese constituency parsing.
no code implementations • 30 Sep 2020 • Kiet Van Nguyen, Duc-Vu Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
Due to the lack of benchmark datasets for Vietnamese, we present the Vietnamese Question Answering Dataset (UIT-ViQuAD), a new dataset for evaluating MRC models in Vietnamese, a low-resource language.
no code implementations • PACLIC 2020 • Huy Duc Huynh, Hang Thi-Thuy Do, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
There are various studies in this field for many languages, but only a limited number for the Vietnamese language.
1 code implementation • 25 Sep 2020 • Son T. Luu, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Thus, when collecting data on user comments from social networks, the data is usually skewed toward one label, which makes the dataset imbalanced and degrades the model's performance.
no code implementations • 7 Sep 2020 • Khiem Vinh Tran, Hao Phu Phan, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Recently, COVID-19 has affected a variety of real-life aspects of the world and led to dreadful consequences.
no code implementations • 20 Aug 2020 • Son T. Luu, Kiet Van Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
In this paper, we conduct several experiments on a neural network-based model to understand the impact of word representation on Vietnamese multiple-choice machine reading comprehension.
no code implementations • 19 Jun 2020 • Kiet Van Nguyen, Tin Van Huynh, Duc-Vu Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
In particular, we develop a process for creating a corpus for Vietnamese machine reading comprehension.
1 code implementation • 14 Jun 2020 • Duc-Vu Nguyen, Dang Van Thin, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In this paper, we approach Vietnamese word segmentation as a binary classification by using the Support Vector Machine classifier.
3 code implementations • 1 Feb 2020 • Quan Hoang Lam, Quang Duy Le, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
This paper contributes to research on the image captioning task by extending the dataset to a different language: Vietnamese.
1 code implementation • 31 Jan 2020 • Son T. Luu, Hung P. Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Consequently, we compare traditional machine learning and deep learning on a large dataset of Vietnamese user comments from social networks and identify the advantages and disadvantages of each model by comparing their F1-scores. We then pick the two best-performing models, one from the traditional machine learning models and one from the deep neural models.
no code implementations • 16 Jan 2020 • Kiet Van Nguyen, Khiem Vinh Tran, Son T. Luu, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
Although Vietnamese is the 17th most popular native-speaker language in the world, there are not many research studies on Vietnamese machine reading comprehension (MRC), the task of understanding a text and answering questions about it.
no code implementations • 27 Dec 2019 • Tin Van Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen
In addition, we also proposed a simple and effective ensemble model combining different deep neural network models.
no code implementations • 21 Nov 2019 • Vong Anh Ho, Duong Huynh-Cong Nguyen, Danh Hoang Nguyen, Linh Thi-Van Pham, Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In this task, the result is expressed neither as a polarity (positive or negative) nor as a rating (from 1 to 5), but at a more detailed level of analysis, using emotions such as sadness, enjoyment, anger, disgust, fear, and surprise.
no code implementations • 17 Nov 2019 • Binh An Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In recent years, Vietnamese Named Entity Recognition (NER) systems have had a great breakthrough when using Deep Neural Network methods.
no code implementations • 17 Nov 2019 • Phu X. V. Nguyen, Tham T. T. Hong, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Student feedback is an important source of students' opinions for improving the quality of training activities.
1 code implementation • 9 Nov 2019 • Hang Thi-Thuy Do, Huy Duc Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen
In this paper, we describe our system, which participated in the Hate Speech Detection on Social Networks shared task of the VLSP 2019 evaluation campaign.
no code implementations • 9 Nov 2019 • Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In recent years, dependency parsing has been a fascinating research topic with many applications in natural language processing.
1 code implementation • 9 Nov 2019 • Tin Van Huynh, Vu Duc Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen
In recent years, Hate Speech Detection has become one of the interesting fields in natural language processing or computational linguistics.
Hate Speech Detection Vietnamese Social Media Text Processing
no code implementations • 9 Nov 2019 • Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
Dependency parsing is needed in different applications of natural language processing.
no code implementations • 30 Oct 2019 • Binh Duc Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
In Vietnamese dependency parsing, several methods have been proposed.