no code implementations • Findings (ACL) 2022 • Yiqun Yao, Rada Mihalcea
Moreover, for different modalities, the best unimodal models may work under significantly different learning rates due to the nature of the modality and the computational flow of the model; thus, selecting a global learning rate for late-fusion models can result in a vanishing gradient for some modalities.
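The per-modality learning-rate idea can be illustrated with a plain SGD update in which each modality's parameters use their own step size. This is a minimal sketch, not the paper's implementation; the modality names, rates, and toy values are assumptions:

```python
# Hypothetical late-fusion setup: each modality's encoder parameters get
# their own learning rate instead of one global rate. All values are toys.
MODALITY_LRS = {"text": 1e-3, "audio": 1e-4, "video": 5e-5}  # assumed rates

def sgd_step(params, grads, modality_lrs):
    """One SGD update, scaling each modality's gradients by its own LR."""
    updated = {}
    for modality, weights in params.items():
        lr = modality_lrs[modality]
        updated[modality] = [w - lr * g
                             for w, g in zip(weights, grads[modality])]
    return updated

params = {"text": [0.5, -0.2], "audio": [0.1, 0.3], "video": [0.0, 0.7]}
grads  = {"text": [0.1, 0.1], "audio": [0.2, -0.1], "video": [0.05, 0.05]}
params = sgd_step(params, grads, MODALITY_LRS)
```

With a single global rate, the modality whose gradients are smallest in scale would effectively stop learning; per-modality rates sidestep that.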
no code implementations • 25 Apr 2024 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang
Large language models (LLMs) have showcased profound capabilities in language understanding and generation, facilitating a wide array of applications.
no code implementations • 4 Mar 2024 • Zhenru Lin, Yiqun Yao, Yang Yuan
Large language models (LLMs) such as ChatGPT are increasingly proficient in understanding and generating a mixture of code and text.
no code implementations • 7 Sep 2023 • Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin, Zheng Zhang, Aixin Sun, Yequan Wang
We demonstrate that a 101B-parameter LLM with 0.31T tokens can be trained with a budget of 100K US dollars.
1 code implementation • 4 May 2023 • Yiqun Yao, Zheng Zhang, Jing Li, Yequan Wang
In terms of growth schedule, the impact of each individual dimension on a schedule's efficiency is under-explored in existing work.
1 code implementation • 14 Apr 2023 • Yiqun Yao, Siqi Fan, Xiusheng Huang, Xuezhi Fang, Xiang Li, Ziyi Ni, Xin Jiang, Xuying Meng, Peng Han, Shuo Shang, Kang Liu, Aixin Sun, Yequan Wang
With around 14% of the one-time pre-training cost, we can accurately forecast the loss for models up to 52B.
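Forecasting a large model's loss from cheap small-scale runs is in the spirit of scaling-law extrapolation. The following is a generic hedged sketch of that idea, not the paper's actual method: fit a power law L(N) = a · N^(−b) from two small-model measurements, then extrapolate to a larger parameter count (all measurement values are invented):

```python
import math

def fit_power_law(n1, l1, n2, l2):
    """Fit L(N) = a * N**(-b) exactly through two (size, loss) points."""
    b = (math.log(l1) - math.log(l2)) / (math.log(n2) - math.log(n1))
    a = l1 * n1 ** b
    return a, b

def predict_loss(a, b, n):
    return a * n ** (-b)

# Toy measurements from two hypothetical small training runs.
a, b = fit_power_law(1e8, 3.5, 1e9, 3.0)
big_loss = predict_loss(a, b, 52e9)  # extrapolate to a 52B-parameter model
```

Real forecasting methods fit more points and richer functional forms; this only shows the extrapolation step.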
no code implementations • NAACL 2021 • Yiqun Yao, Michalis Papakostas, Mihai Burzo, Mohamed Abouelenien, Rada Mihalcea
The capability to automatically detect human stress can benefit artificially intelligent agents involved in affective computing and human-computer interaction.
no code implementations • NAACL 2019 • Yiqun Yao, Jiaming Xu, Bo Xu
Visual Dialog is a multi-modal task that requires a model to participate in a multi-turn human dialog grounded on an image, and generate correct, human-like responses.
no code implementations • 15 Nov 2018 • Jing Shi, Jiaming Xu, Yiqun Yao, Bo Xu
In this paper, we present a memory-augmented neural network which is motivated by the process of human concept learning.
1 code implementation • EMNLP 2018 • Yiqun Yao, Jiaming Xu, Feng Wang, Bo Xu
Our code is available at https://github.com/FlamingHorizon/CMM-VR.
no code implementations • WS 2016 • Jing Shi, Jiaming Xu, Yiqun Yao, Suncong Zheng, Bo Xu
As the evaluation results show, our solution provides a valuable and compact model that can be applied to question answering or sentence semantic relevance modelling.
1 code implementation • COLING 2016 • Jiaming Xu, Jing Shi, Yiqun Yao, Suncong Zheng, Bo Xu
Recently, end-to-end memory networks have shown promising results on the Question Answering task; they encode past facts into an explicit memory and perform reasoning by making multiple computational steps over that memory.
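The multi-step reasoning over an explicit memory can be sketched as repeated attention hops: score each memory slot against the query, take a softmax-weighted sum of the slots, and fold that response back into the query. A minimal pure-Python illustration with toy vectors (not the paper's architecture, which also learns embedding matrices per hop):

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def memory_hop(query, memory):
    """One reasoning hop: attend over memory slots, return updated query."""
    # Attention scores: dot product of the query with each memory slot.
    scores = [sum(q * m for q, m in zip(query, slot)) for slot in memory]
    probs = softmax(scores)
    # Response vector: attention-weighted sum of the memory slots.
    response = [sum(p * slot[d] for p, slot in zip(probs, memory))
                for d in range(len(query))]
    # Combine response with the query for the next hop.
    return [q + r for q, r in zip(query, response)]

memory = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # encoded facts (toy values)
query = [1.0, 0.0]
for _ in range(3):  # three computational hops over the memory
    query = memory_hop(query, memory)
```

Each hop sharpens the query toward the memory slots it already matches, which is how multiple hops support chained reasoning.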