no code implementations • 1 Mar 2024 • Polina Tsvilodub, Hening Wang, Sharon Grosch, Michael Franke
This paper systematically compares different methods of deriving item-level predictions of language models for multiple-choice tasks.
Multiple-choice
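As a rough illustration of what "deriving an item-level prediction" can mean, the sketch below picks an answer option from per-token log-probabilities in two common ways: by summed log-probability and by length-normalized (mean) log-probability. It is not taken from the paper. The function name `predict`, the `normalize` flag, and the numbers in `example` are all hypothetical, and the paper may compare other or additional methods.

```python
def predict(option_logprobs, normalize=False):
    """Return the index of the highest-scoring answer option.

    option_logprobs: one list of token log-probabilities per answer option
    (assumed to have been obtained from a language model beforehand).
    normalize: if True, score by mean token log-probability instead of the sum.
    """
    scores = []
    for token_lps in option_logprobs:
        total = sum(token_lps)
        scores.append(total / len(token_lps) if normalize else total)
    # Index of the maximum score; ties resolve to the first such option.
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical log-probabilities for one item with three answer options.
example = [
    [-0.5, -0.7],              # short option
    [-0.3, -0.4, -0.3, -0.4],  # longer option with higher per-token probability
    [-2.0],                    # single-token option
]

by_sum = predict(example)                   # favors the short first option
by_mean = predict(example, normalize=True)  # favors the longer second option
```

On this made-up item the two scoring rules select different options, which is the kind of method-dependent divergence a systematic comparison would need to account for.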