Hybrid Multimodal Feature Extraction, Mining and Fusion for Sentiment Analysis

Paper title

Hybrid Multimodal Feature Extraction, Mining and Fusion for Sentiment Analysis

Authors

Li, Jia; Zhang, Ziyang; Lang, Junjie; Jiang, Yueqi; An, Liuwei; Zou, Peng; Xu, Yangyang; Gao, Sheng; Lin, Jie; Fan, Chunxiao; Sun, Xiao; Wang, Meng

Abstract

In this paper, we present our solutions for the Multimodal Sentiment Analysis Challenge (MuSe) 2022, which comprises the MuSe-Humor, MuSe-Reaction and MuSe-Stress sub-challenges. MuSe 2022 focuses on humor detection, emotional reactions and multimodal emotional stress, using different modalities and data sets. In our work, several kinds of multimodal features are extracted, including acoustic, visual, text and biological features. These features are fused by TEMMA and GRU frameworks with a self-attention mechanism. Our contributions are as follows: 1) Several new audio features, facial expression features and paragraph-level text embeddings are extracted to improve accuracy. 2) We substantially improve the accuracy and reliability of multimodal sentiment prediction by mining and blending the multimodal features. 3) Effective data augmentation strategies are applied in model training to alleviate sample imbalance and to prevent the model from learning biased subject characteristics. For the MuSe-Humor sub-challenge, our model obtains an AUC score of 0.8932. For the MuSe-Reaction sub-challenge, the Pearson's correlation coefficient of our approach on the test set is 0.3879, which outperforms all other participants. For the MuSe-Stress sub-challenge, our approach outperforms the baseline in both arousal and valence on the test set, reaching a final combined result of 0.5151.
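
The abstract reports two standard metrics by name: AUC for MuSe-Humor and Pearson's correlation coefficient for MuSe-Reaction. As a rough illustration of what those numbers measure, the following pure-Python sketch computes both from scratch. It is not the authors' code or the challenge's official evaluation script; the function names `pearson_correlation` and `roc_auc` and the toy inputs are assumptions made here for illustration only.

```python
import math


def pearson_correlation(pred, target):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(pred)
    if n != len(target) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_p == 0 or var_t == 0:
        return float("nan")  # undefined when either sequence is constant
    return cov / math.sqrt(var_p * var_t)


def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels are 0/1; tied scores receive their average rank.
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        positives_in_run = sum(1 for k in range(i, j + 1) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_run
        i = j + 1
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy values only; these are not results from the paper.
    print(pearson_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

In practice, library routines such as `scipy.stats.pearsonr` and `sklearn.metrics.roc_auc_score` would normally be used instead of hand-written versions.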
