1 code implementation • 6 Jun 2024 • Kaiyan Zhang, Sihang Zeng, Ermo Hua, Ning Ding, Zhang-Ren Chen, Zhiyuan Ma, Haoxin Li, Ganqu Cui, Biqing Qi, Xuekai Zhu, Xingtai Lv, Hu Jinfang, Zhiyuan Liu, BoWen Zhou
Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains and are moving towards more specialized areas.
no code implementations • 27 May 2024 • Biqing Qi, Junqi Gao, Kaiyan Zhang, Dong Li, Jianxing Liu, Ligang Wu, BoWen Zhou
Experiments on long-range modeling tasks in autoregressive language modeling and Long Range Arena demonstrate the general effectiveness of the SMR mechanism for a series of SSM models.
1 code implementation • 20 May 2024 • Ermo Hua, Biqing Qi, Kaiyan Zhang, Yue Yu, Ning Ding, Xingtai Lv, Kai Tian, BoWen Zhou
To obtain a unified understanding, we interpret SFT and PO through two sub-processes -- Preference Estimation and Transition Optimization -- defined at the token level within the Markov Decision Process (MDP) framework.
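To make the token-level MDP view concrete, here is a minimal sketch (illustrative only, not the paper's implementation; the function name and tensor shapes are assumptions): each prefix acts as a state, each next token as an action, and the per-token log-probability serves as a crude preference estimate, so the SFT objective reduces to the mean negative per-token log-probability.

```python
# Minimal sketch of the token-level MDP view (illustrative, not the paper's code):
# a prefix is a state, the next token is an action, and its log-probability
# acts as a per-token preference estimate.
import torch
import torch.nn.functional as F

def token_level_log_probs(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """logits: (seq_len, vocab_size); targets: (seq_len,). Per-token log-probs."""
    log_probs = F.log_softmax(logits, dim=-1)            # action distribution per state
    return log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)

# Under this view, the SFT objective is just the mean negative per-token log-prob:
# sft_loss = -token_level_log_probs(logits, targets).mean()
```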
no code implementations • 29 Mar 2024 • Che Jiang, Biqing Qi, Xiangyu Hong, Dayuan Fu, Yang Cheng, Fandong Meng, Mo Yu, BoWen Zhou, Jie Zhou
In hallucinated cases, the information carried by the output token rarely shows an abrupt increase or consistent dominance in the model's later layers.
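One way to probe this layer-wise dynamic is a logit-lens-style readout (an assumption on our part; the paper's actual probe may differ): project each layer's hidden state through the final layer norm and unembedding, and track the probability assigned to the token the model eventually emits.

```python
# Hedged sketch (logit-lens style, not necessarily the paper's exact method):
# track how much probability the final output token receives at each layer;
# hallucinated outputs tend to lack a late, abrupt rise.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any causal LM works for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)

inputs = tok("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)
answer_id = out.logits[0, -1].argmax()  # the token the model finally emits

for layer, hidden in enumerate(out.hidden_states):
    logits = model.lm_head(model.transformer.ln_f(hidden[0, -1]))  # project to vocab
    prob = logits.softmax(-1)[answer_id].item()
    print(f"layer {layer:2d}: p(answer) = {prob:.4f}")
```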
no code implementations • 7 Mar 2024 • Biqing Qi, Junqi Gao, Xingquan Chen, Dong Li, Jianxing Liu, Ligang Wu, BoWen Zhou
However, current EM-based methods retrieve memory globally by performing Vector-to-Vector (V2V) interaction between the input's features and the prototypes stored in EM, neglecting the geometric structure of local features.
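The V2V interaction being criticized can be sketched as follows (function name and shapes are hypothetical): a single pooled query vector is compared globally against every prototype in episodic memory, which is exactly what discards the geometric structure of local features.

```python
# Illustrative sketch of global Vector-to-Vector (V2V) retrieval from
# episodic memory (EM); names and shapes are hypothetical.
import torch
import torch.nn.functional as F

def v2v_retrieve(query: torch.Tensor, prototypes: torch.Tensor, k: int = 1):
    """query: (d,), prototypes: (n, d). Returns indices of the top-k prototypes."""
    sims = F.cosine_similarity(query.unsqueeze(0), prototypes, dim=-1)  # (n,)
    return sims.topk(k).indices

# Because the query is pooled into one vector, local feature geometry is
# lost -- the limitation this paper targets.
```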
1 code implementation • 5 Mar 2024 • Biqing Qi, Xingquan Chen, Junqi Gao, Dong Li, Jianxing Liu, Ligang Wu, BoWen Zhou
Drawing on Complementary Learning System theory, this paper presents a novel Interactive Continual Learning (ICL) framework, enabled by collaborative interactions among models of various sizes.
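As a hedged illustration of that collaboration (the routing rule, threshold, and model callables below are assumptions, not the paper's algorithm), a small fast model can answer first and defer to a larger model only when its predictive entropy is high:

```python
# Hypothetical sketch of small/large model collaboration: the small model
# handles confident cases; uncertain ones are deferred to the large model.
import torch

def collaborative_predict(x, small_model, large_model, entropy_threshold=1.0):
    probs = small_model(x).softmax(-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)
    if entropy.item() < entropy_threshold:
        return probs.argmax(-1)                       # fast path: small model
    return large_model(x).softmax(-1).argmax(-1)      # slow path: large model
```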
no code implementations • 5 Mar 2024 • Kaiyan Zhang, Jianyu Wang, Ermo Hua, Biqing Qi, Ning Ding, BoWen Zhou
With the advancement of language models (LMs), their exposure to private data is increasingly inevitable, and deploying them (especially smaller ones) on personal devices such as PCs and smartphones has become a prevailing trend.
no code implementations • 26 Feb 2024 • Biqing Qi, Junqi Gao, Yiang Luo, Jianxing Liu, Ligang Wu, BoWen Zhou
The rise of generative neural networks has triggered an increased demand for intellectual property (IP) protection in generated content.
no code implementations • 10 Nov 2023 • Biqing Qi, Kaiyan Zhang, Haoxiang Li, Kai Tian, Sihang Zeng, Zhang-Ren Chen, BoWen Zhou
We subsequently evaluate the hypothesis generation capabilities of various top-tier instruction-tuned models in zero-shot, few-shot, and fine-tuning settings, covering both closed- and open-source LLMs.
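The zero-shot versus few-shot settings can be illustrated with a simple prompt builder (the wording and example pairs are invented for illustration, not drawn from the paper's benchmark):

```python
# Hypothetical illustration of zero-shot vs. few-shot prompting for
# hypothesis generation; prompt wording and examples are invented.
def build_prompt(background, examples=None):
    parts = ["Given the background, propose a scientific hypothesis."]
    for bg, hyp in (examples or []):          # few-shot: prepend solved examples
        parts.append(f"Background: {bg}\nHypothesis: {hyp}")
    parts.append(f"Background: {background}\nHypothesis:")  # the actual query
    return "\n\n".join(parts)

zero_shot = build_prompt("Gene X is overexpressed in tumor tissue.")
few_shot = build_prompt(
    "Gene X is overexpressed in tumor tissue.",
    examples=[("Drug A lowers blood pressure in trials.",
               "Drug A antagonizes receptor B.")],
)
```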
no code implementations • 24 Oct 2023 • Kaiyan Zhang, Ning Ding, Biqing Qi, Xuekai Zhu, Xinwei Long, BoWen Zhou
Instruction tuning has recently been recognized as an effective way of aligning Large Language Models (LLMs) to enhance their generalization ability across various tasks.
1 code implementation • 23 May 2023 • Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xinwei Long, Zhouhan Lin, BoWen Zhou
While large language models (LLMs) excel at various natural language processing tasks, their enormous size and inaccessible parameters pose challenges for practical deployment.