20 Jul 2020 • Xiangyang Mou, Brandyn Sigouin, Ian Steenstra, Hui Su
Unlike purely text-based dialogue state tracking, a dialogue in AVSD consists of a sequence of question-answer pairs about a video, and answering the final question requires additional understanding of the video itself.