CSAT-FTCN: A Fuzzy-Oriented Model with Contextual Self-attention Network for Multimodal Emotion Recognition

Multimodal emotion analysis has attracted growing attention because of its wide range of applications, such as question-answering systems. In real-world scenarios, however, people often hold mixed or partial emotions toward the objects they evaluate. In this paper, we introduce a fuzzy temporal convolutional network based on a contextual self-attention network (CSAT-FTCN) to address these challenges. The model uses membership functions to represent various fuzzy emotions, enabling a deeper understanding of emotional states. In addition, the CSAT-FTCN captures the dependencies of a target utterance on both its own internal key information and external contextual information. For multimodal data, we further introduce an attention fusion (ATF) mechanism to capture the dependencies among different modalities. Experimental results show that CSAT-FTCN outperforms state-of-the-art models on the tested datasets, providing a novel approach to multimodal emotion analysis.
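The abstract names two building blocks without giving their form: membership functions over fuzzy emotion classes and an attention fusion (ATF) step across modalities. Below is a minimal PyTorch sketch of how such components are commonly realized, not the authors' released code; the Gaussian membership form, the module names, and all dimensions are illustrative assumptions.

```python
# Hedged sketch of two ideas from the abstract (not the paper's implementation):
# (1) a Gaussian membership layer mapping an utterance embedding to soft degrees
#     of membership over K fuzzy emotion classes, and
# (2) a simple attention fusion step that weights per-modality features by a
#     learned relevance score. All names and shapes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianMembership(nn.Module):
    """Membership degrees mu_k(x) = exp(-||x - c_k||^2 / (2 * sigma_k^2))."""
    def __init__(self, dim: int, num_classes: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, dim))
        self.log_sigma = nn.Parameter(torch.zeros(num_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim) -> memberships: (batch, num_classes)
        dist2 = torch.cdist(x, self.centers).pow(2)           # (batch, K)
        sigma2 = (2.0 * self.log_sigma.exp().pow(2)).clamp_min(1e-6)
        return torch.exp(-dist2 / sigma2)

class AttentionFusion(nn.Module):
    """Weight each modality's feature vector by a learned attention score."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, modalities: torch.Tensor) -> torch.Tensor:
        # modalities: (batch, num_modalities, dim)
        weights = F.softmax(self.score(modalities), dim=1)    # (batch, M, 1)
        return (weights * modalities).sum(dim=1)              # (batch, dim)

if __name__ == "__main__":
    text, audio, video = (torch.randn(4, 128) for _ in range(3))
    fused = AttentionFusion(128)(torch.stack([text, audio, video], dim=1))
    memberships = GaussianMembership(128, num_classes=6)(fused)
    print(memberships.shape)  # torch.Size([4, 6])
```

In this reading, the membership vector replaces a hard one-hot label with graded degrees of belonging, which is how mixed or partial emotions can be represented; the contextual self-attention over utterance sequences described in the abstract would sit upstream of both modules.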
