1 code implementation • 20 Oct 2023 • Andrea Sottana, Bin Liang, Kai Zou, Zheng Yuan
The evaluation of Large Language Models (LLMs) is patchy and inconsistent, and it is becoming clear that the quality of automatic evaluation metrics is not keeping pace with the development of generative models.