The Reddit Engagement Dataset (RED) is a distant-supervision dataset of 80k single-turn conversations. RED is sourced from Reddit: it samples from 43 popular subreddits and was processed from a total of 5 million posts, filtering out posts that were non-conversational or toxic, or whose popularity could not be ascertained.
Source: EnDex: Evaluation of Dialogue Engagingness at Scale