SENTIMENT ANALYSIS MODEL OF MULTI-LEVEL ATTENTION CROSS-MODAL SELF-ADAPTIVE FUSION
Abstract
Research on sentiment analysis for video is less extensive than that for text and images, and the extraction of cross-modal relationships between different modalities still suffers from noise and information redundancy. Therefore, this study proposes a text-video sentiment analysis model based on multi-level attention cross-modal self-adaptive fusion (MACSF). The extracted text and video features are fused twice under multi-head hierarchical attention (MHA) to obtain secondary fusion features with interactive semantics. The text features and the secondary fusion features are then passed to a self-adaptive cross-modal integration module to obtain the final fusion features, which are fed into a multi-layer perceptron and a Softmax function to obtain the sentiment classification results. Experiments on the public MOSI and MOSEI datasets show that the proposed model mitigates the noise problem in cross-modal interaction and effectively improves sentiment analysis performance.
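
The following is a minimal sketch of the fusion pipeline summarized above, written with PyTorch. The module names, dimensions, and the gated form used for the "self-adaptive" integration are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn as nn

class MACSFSketch(nn.Module):
    def __init__(self, dim=128, heads=4, num_classes=2):
        super().__init__()
        # Two cross-modal attention stages standing in for the two MHA fusion passes.
        self.attn1 = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn2 = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Assumed learned gate for the self-adaptive cross-modal integration.
        self.gate = nn.Linear(2 * dim, dim)
        # Classifier head: multi-layer perceptron followed by Softmax.
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                 nn.Linear(dim, num_classes))

    def forward(self, text_feat, video_feat):
        # First fusion: text features attend to video features.
        fused1, _ = self.attn1(text_feat, video_feat, video_feat)
        # Second fusion: the first result attends to video again, giving
        # the "secondary fusion features" with interactive semantics.
        fused2, _ = self.attn2(fused1, video_feat, video_feat)
        # Self-adaptive integration: a gate weighs text vs. fused features.
        g = torch.sigmoid(self.gate(torch.cat([text_feat, fused2], dim=-1)))
        final = g * text_feat + (1 - g) * fused2
        # Pool over the sequence and classify.
        logits = self.mlp(final.mean(dim=1))
        return torch.softmax(logits, dim=-1)

# Example usage with random features (batch=2, sequence length=10, dim=128).
model = MACSFSketch()
probs = model(torch.randn(2, 10, 128), torch.randn(2, 10, 128))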