Avicenna: a challenge dataset for natural language generation toward commonsense syllogistic reasoning

Syllogism is a type of everyday reasoning. For instance, given that "Avicenna wrote the famous book the Canon of Medicine" and "The Canon of Medicine has influenced modern medicine," it can be concluded that "Avicenna has influenced modern medicine." This study revolves around syllogistic natural language generation (NLG). The Avicenna corpus was developed as a benchmark for syllogistic NLG: once the syllogistic relation between two premises is recognized, models trained on Avicenna learn to generate the conclusion sentence (which is semantically unique). The experiments were performed using state-of-the-art pre-trained text generative models (TGMs). A state-of-the-art baseline was provided, and accuracy improved to 32% when transfer learning was adopted. The models were evaluated using both human and automatic procedures. Error analysis showed that one of the main error categories was the models' confusion in detecting the middle term (the part that appears in both premises with the same meaning). This indicates that the models learn to extract new facts from the premises but struggle with commonsense reasoning. The outcomes of the study demonstrate that although TGMs are remarkably powerful, they do not yet have sufficient reasoning capabilities to generate text based on commonsense knowledge. The Avicenna dataset thus poses a new commonsense-inference challenge that is easy for humans (98.1%) but difficult for state-of-the-art TGMs (32%).
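The task can be framed as sequence-to-sequence generation: the two premises are given as input and the model is asked to produce the conclusion. Below is a minimal sketch of that setup using a pre-trained seq2seq model from Hugging Face Transformers; the model name, input template, and generation settings are illustrative assumptions rather than the paper's exact configuration, and in practice the model would first be fine-tuned on Avicenna premise/conclusion pairs.

```python
# Minimal sketch: conclusion generation from a premise pair with a
# pre-trained text generative model. Model choice, input template, and
# decoding parameters are assumptions for illustration only.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "t5-base"  # assumption: any pre-trained seq2seq TGM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

premise_1 = "Avicenna wrote the famous book the Canon of Medicine."
premise_2 = "The Canon of Medicine has influenced modern medicine."

# Assumed input format: both premises concatenated behind a task prefix.
source = f"generate conclusion: {premise_1} {premise_2}"
inputs = tokenizer(source, return_tensors="pt", truncation=True)

# Generate the conclusion sentence, e.g.
# "Avicenna has influenced modern medicine." (after fine-tuning).
output_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```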
