no code implementations • 26 May 2024 • Cong Zhang, Derrick Goh Xin Deik, Dexun Li, Hao Zhang, Yong Liu
Effective planning is crucial for the success of LLM agents in real-world tasks, making it a highly pursued topic in the community.
no code implementations • 23 Apr 2024 • Kuicai Dong, Derrick Goh Xin Deik, Yi Quan Lee, Hao Zhang, Xiangyang Li, Cong Zhang, Yong Liu
As they do not consider content structures, the resultant chunks can exclude vital information or include irrelevant content.
no code implementations • 15 Feb 2024 • Dexun Li, Cong Zhang, Kuicai Dong, Derrick Goh Xin Deik, Ruiming Tang, Yong Liu
We propose the Distributional Preference Reward Model (DPRM), a simple yet effective framework to align large language models with diverse human preferences.
1 code implementation • 20 Oct 2023 • Philip John Gorinski, Matthieu Zimmer, Gerasimos Lampouras, Derrick Goh Xin Deik, Ignacio Iacobacci
Large pre-trained language models have shown remarkable performance on various Code Synthesis benchmarks, treating Code Generation in a fashion similar to Natural Language Generation and training with a Language Modelling (LM) objective.