no code implementations • EMNLP 2021 • Huimin Wang, Kam-Fai Wong
Most reinforcement learning methods for dialog policy learning train a centralized agent that selects a predefined joint action concatenating domain name, intent type, and slot name.
no code implementations • 21 Feb 2024 • Hongru Wang, Boyang Xue, Baohang Zhou, Tianhua Zhang, Cunxiang Wang, Guanhua Chen, Huimin Wang, Kam-Fai Wong
Retrieve-then-read and generate-then-read are two typical solutions for handling unknown and known questions in open-domain question answering: the former retrieves the necessary external knowledge, while the latter prompts large language models to generate internal knowledge encoded in their parameters.
no code implementations • 28 Sep 2023 • Hongru Wang, Huimin Wang, Lingzhi Wang, Minda Hu, Rui Wang, Boyang Xue, Hongyuan Lu, Fei Mi, Kam-Fai Wong
Large language models (LLMs) have demonstrated exceptional performance in planning the use of various functional tools, such as calculators and retrievers, particularly in question-answering tasks.
no code implementations • 5 Sep 2023 • Huimin Wang, Wai-Chung Kwan, Kam-Fai Wong
Recent works usually address dialog policy learning (DPL) by training a reinforcement learning (RL) agent to determine the best dialog action.
1 code implementation • 1 Sep 2023 • Wai-Chung Kwan, Huimin Wang, Hongru Wang, Zezhong Wang, Xian Wu, Yefeng Zheng, Kam-Fai Wong
In addition, JoTR employs reinforcement learning with a reward-shaping mechanism to efficiently finetune the word-level dialogue policy, which allows the model to learn from its interactions, improving its performance over time.
1 code implementation • 17 Jul 2023 • Huimin Wang, Wai-Chung Kwan, Kam-Fai Wong, Yefeng Zheng
Automatic diagnosis (AD), a critical application of AI in healthcare, employs machine learning techniques to assist doctors in gathering patient symptom information for precise disease diagnosis.
1 code implementation • 25 May 2023 • Zhiming Mao, Huimin Wang, Yiming Du, Kam-Fai Wong
Moreover, conditioned on user history encoded by Transformer encoders, our framework leverages Transformer decoders to estimate the language perplexity of candidate text items, which can serve as a straightforward yet significant contrastive signal for user-item text matching.
no code implementations • 28 Feb 2022 • Wai-Chung Kwan, Hongru Wang, Huimin Wang, Kam-Fai Wong
In this paper, we survey recent advances and challenges in dialogue policy from the perspective of RL.
no code implementations • 2 Nov 2021 • Hongru Wang, Huimin Wang, Zezhong Wang, Kam-Fai Wong
Reinforcement Learning (RL) has demonstrated its potential for training a dialogue policy agent to maximize the accumulated rewards given by users.
no code implementations • ACL 2020 • Huimin Wang, Baolin Peng, Kam-Fai Wong
Training a task-oriented dialogue agent with reinforcement learning is prohibitively expensive since it requires a large volume of interactions with users.