1 code implementation • 13 Jun 2022 • Yaru Hao, Haoyu Song, Li Dong, Shaohan Huang, Zewen Chi, Wenhui Wang, Shuming Ma, Furu Wei
Experimental results across various language-only and vision-language benchmarks show that our model outperforms or is competitive with specialized models on finetuning, zero-shot generalization, and few-shot learning.
Ranked #2 on Image Captioning on nocaps val
1 code implementation • 20 May 2022 • Weizhi Wang, Li Dong, Hao Cheng, Haoyu Song, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei
With the visually-augmented context, VaLM uses a visual knowledge fusion layer to enable multimodal grounded language modeling by attending to both text context and visual knowledge in images.
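The fusion step described above — attending over both the text context and retrieved visual knowledge — can be sketched as a joint attention over concatenated key/value sets. This is a minimal illustrative sketch, not the paper's actual VaLM layer; all function names, shapes, and the single-head scaled dot-product form are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fused_attention(query, text_kv, visual_kv):
    """Attend jointly over text and visual representations (illustrative sketch).

    query:     (d,)      hidden state of the current token
    text_kv:   (n_t, d)  representations of the text context
    visual_kv: (n_v, d)  representations of retrieved visual knowledge
    """
    # Concatenate text and visual entries into one attention pool,
    # so the model can draw on either modality when forming the context.
    keys = np.concatenate([text_kv, visual_kv], axis=0)   # (n_t + n_v, d)
    scores = keys @ query / np.sqrt(query.shape[0])       # scaled dot-product
    weights = softmax(scores)                             # sums to 1 over both modalities
    return weights @ keys                                 # fused context vector, (d,)
```

In a real implementation this would be multi-head attention with separate key/value projections per modality; the sketch only shows the grounding idea of mixing the two sources in one attention distribution.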
no code implementations • ACL 2022 • Haoyu Song, Li Dong, Wei-Nan Zhang, Ting Liu, Furu Wei
We first evaluate CLIP's zero-shot performance on a typical visual question answering task and demonstrate a zero-shot cross-modality transfer capability of CLIP on the visual entailment task.
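Zero-shot transfer with CLIP-style models reduces to comparing an image embedding against text embeddings of candidate answers in the shared space. The sketch below assumes precomputed embeddings and illustrative names; it is not the paper's evaluation pipeline.

```python
import numpy as np

def zero_shot_answer(image_emb, answer_embs, answers):
    """Pick the candidate answer most similar to the image (illustrative sketch).

    image_emb:   (d,)    image embedding, assumed precomputed by a CLIP-style encoder
    answer_embs: (n, d)  text embeddings of the n candidate answers
    answers:     list of n answer strings
    """
    # Normalize both sides so dot products become cosine similarities.
    img = image_emb / np.linalg.norm(image_emb)
    txt = answer_embs / np.linalg.norm(answer_embs, axis=1, keepdims=True)
    sims = txt @ img                      # cosine similarity per candidate
    return answers[int(np.argmax(sims))]  # highest-similarity answer wins
```

Because no task-specific parameters are trained, the same matching procedure transfers across tasks (e.g. from question answering to entailment) by changing only the candidate text prompts.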
1 code implementation • ACL 2021 • Haoyu Song, Yan Wang, Kaiyan Zhang, Wei-Nan Zhang, Ting Liu
Maintaining consistent personas is essential for dialogue agents.
1 code implementation • EMNLP 2020 • Haoyu Song, Yan Wang, Wei-Nan Zhang, Zhengyu Zhao, Ting Liu, Xiaojiang Liu
Maintaining a consistent attribute profile is crucial for dialogue agents to naturally converse with humans.
no code implementations • ACL 2020 • Haoyu Song, Yan Wang, Wei-Nan Zhang, Xiaojiang Liu, Ting Liu
Maintaining a consistent personality in conversations is quite natural for human beings, but is still a non-trivial task for machines.
1 code implementation • 14 Nov 2019 • Haoyu Song, Wei-Nan Zhang, Jingwen Hu, Ting Liu
Consistency is one of the major challenges faced by dialogue agents.
1 code implementation • 29 May 2019 • Haoyu Song, Wei-Nan Zhang, Yiming Cui, Dong Wang, Ting Liu
Given a conversational context with persona information, how a chatbot can exploit that information to generate diverse and sustainable conversations remains a non-trivial task.