Paper Title
Tag-assisted Multimodal Sentiment Analysis under Uncertain Missing Modalities
Paper Authors
Abstract
Multimodal sentiment analysis has mostly been studied under the assumption that all modalities are available. This strong assumption does not always hold in practice, and most multimodal fusion models may fail when some modalities are missing. Several works have addressed the missing-modality problem, but most of them consider only the case of a single missing modality and ignore the more general, and practically more common, case of multiple missing modalities. To this end, we propose a Tag-Assisted Transformer Encoder (TATE) network to handle uncertain missing modalities. Specifically, we design a tag encoding module that covers both the single-missing-modality and multiple-missing-modality cases, so as to guide the network's attention to the missing modalities. In addition, we adopt a new space projection pattern to align common vectors. A Transformer encoder-decoder network is then used to learn the features of the missing modalities. Finally, the outputs of the Transformer encoder are used for sentiment classification. Extensive experiments on the CMU-MOSI and IEMOCAP datasets show that our method achieves significant improvements over several baselines.
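The abstract does not spell out how the tag is constructed, so the following is only a minimal sketch of one plausible encoding, not the TATE implementation. It assumes three modalities in a fixed order (visual, acoustic, textual) and uses one bit per modality, set to 1 when that modality is missing. The names `MODALITIES` and `encode_missing_tag` are hypothetical and chosen purely for illustration.

```python
from typing import Iterable, List

# Assumed modality set and ordering; the abstract does not name them.
MODALITIES = ("visual", "acoustic", "textual")


def encode_missing_tag(missing: Iterable[str]) -> List[int]:
    """Return one bit per modality: 1 if the modality is missing, else 0.

    This is an illustrative encoding only; the paper's actual tag
    format may differ.
    """
    missing_set = set(missing)
    unknown = missing_set - set(MODALITIES)
    if unknown:
        raise ValueError(f"unknown modalities: {sorted(unknown)}")
    return [1 if name in missing_set else 0 for name in MODALITIES]


if __name__ == "__main__":
    # Single missing modality.
    print(encode_missing_tag(["acoustic"]))
    # Multiple missing modalities.
    print(encode_missing_tag(["visual", "textual"]))
```

Under this assumption a single vector format covers both the single-missing and multiple-missing cases, which is the property the abstract attributes to the tag encoding module. Such a tag could then be combined with the available modality features before they are passed to the network.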