no code implementations • 29 Apr 2024 • Dingjie Song, Shunian Chen, Guiming Hardy Chen, Fei Yu, Xiang Wan, Benyou Wang
Despite the advancements and impressive performance of Multimodal Large Language Models (MLLMs) on benchmarks, their effectiveness in real-world, long-context, and multi-image tasks is unclear due to the benchmarks' limited scope.
1 code implementation • 18 Feb 2024 • Guiming Hardy Chen, Shunian Chen, Ruifei Zhang, Junying Chen, Xiangbo Wu, Zhiyi Zhang, Zhihong Chen, Jianquan Li, Xiang Wan, Benyou Wang
Recent advancements in Large Vision-Language Models (LVLMs) have enabled processing of multimodal inputs in language models but require significant computational resources for deployment, especially on edge devices.
no code implementations • 16 Feb 2024 • Guiming Hardy Chen, Shunian Chen, Ziche Liu, Feng Jiang, Benyou Wang
Adopting humans and large language models (LLMs) as judges (a.k.a. human- and LLM-as-a-judge) for evaluating the performance of LLMs has recently gained attention.
1 code implementation • 23 Nov 2023 • Wentao Ge, Shunian Chen, Guiming Hardy Chen, Zhihong Chen, Junying Chen, Shuo Yan, Chenghao Zhu, Ziyue Lin, Wenya Xie, Xinyi Zhang, Yichen Chai, Xiaoyu Liu, Dingjie Song, Xidong Wang, Anningzhe Gao, Zhiyi Zhang, Jianquan Li, Xiang Wan, Benyou Wang
Multimodal large language models (MLLMs) (e.g., GPT-4V, LLaVA, and Claude-3) have broadened the scope of AI applications.
1 code implementation • 17 Aug 2023 • Xidong Wang, Guiming Hardy Chen, Dingjie Song, Zhiyi Zhang, Zhihong Chen, Qingying Xiao, Feng Jiang, Jianquan Li, Xiang Wan, Benyou Wang, Haizhou Li
We hope this benchmark provides first-hand experience with existing LLMs for medicine and also facilitates the widespread adoption and enhancement of medical LLMs within China.
1 code implementation • 6 Jun 2023 • Zhihong Chen, Guiming Hardy Chen, Shizhe Diao, Xiang Wan, Benyou Wang
Masked language modeling (MLM) has been one of the most popular pretraining recipes in natural language processing, e.g., in BERT, one of the representative models.