The LogiEval dataset is a benchmark suite for evaluating the logical reasoning abilities of prompt-based language models, in particular instruction-tuned large language models. Here are some key details about LogiEval:
The dataset was developed by researchers to address the need for robust logical reasoning evaluation.
Contents:
The dataset includes 8,678 QA instances sourced from expert-written questions.
Usage:
To use LogiEval, follow the instructions in the GitHub repository: clone the repo, set up the required environment, and run the evaluation scripts against a model's outputs.
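As an illustration of the evaluation step, here is a minimal Python sketch that scores a model's predicted option letters against gold labels for multiple-choice QA instances. The field names ("input", "label") and the instance format are assumptions for illustration, not the repository's actual schema.

```python
# Hypothetical sketch: scoring multiple-choice QA predictions.
# The "input"/"label" field names are assumed, not LogiEval's real schema.

def score(instances, predictions):
    """Return the accuracy of predicted option letters against gold labels."""
    correct = sum(
        1
        for inst, pred in zip(instances, predictions)
        if pred.strip().upper() == inst["label"]
    )
    return correct / len(instances)

# Toy examples in the assumed format.
instances = [
    {"input": "All birds fly. Tweety is a bird. Does Tweety fly? (A) yes (B) no",
     "label": "A"},
    {"input": "No cats bark. Rex barks. Is Rex a cat? (A) yes (B) no",
     "label": "B"},
]
predictions = ["A", "A"]  # the second prediction is wrong

print(score(instances, predictions))  # 0.5
```

In practice the instances would be loaded from the dataset files and the predictions collected from the model under test; only the final accuracy computation is shown here.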
Citation:
The dataset accompanies the paper "Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4" (arXiv:2304.03439).
In summary, LogiEval provides a valuable resource for assessing logical reasoning abilities in prompt-based language models. Researchers can use it to evaluate and compare different models' performance in logical reasoning tasks.
Source: Conversation with Bing, 3/18/2024
(1) Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4. arXiv. https://arxiv.org/pdf/2304.03439.pdf
(2) csitfun/LogiEval: a benchmark suite for testing logical reasoning. GitHub. https://github.com/csitfun/LogiEval
(3) LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning. arXiv. https://arxiv.org/abs/2007.08124
(4) LogicInference: A New Dataset for Teaching Logical Inference. arXiv. https://arxiv.org/abs/2203.15099
(5) Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4. arXiv. https://arxiv.org/abs/2304.03439