Paper Title
PERI: Part Aware Emotion Recognition In The Wild
Paper Authors
Paper Abstract
Emotion recognition aims to interpret the emotional states of a person based on various inputs including audio, visual, and textual cues. This paper focuses on emotion recognition using visual features. To leverage the correlation between facial expression and the emotional state of a person, pioneering methods rely primarily on facial features. However, facial features are often unreliable in natural unconstrained scenarios, such as in crowded scenes, because the face lacks pixel resolution and contains artifacts due to occlusion and blur. To address this, in-the-wild emotion recognition methods exploit full-body person crops as well as the surrounding scene context. In a bid to use body pose for emotion recognition, however, such methods fail to realize the potential that facial expressions, when available, offer. Thus, the aim of this paper is two-fold. First, we present our method, PERI, which leverages both body pose and facial landmarks. We create part aware spatial (PAS) images by extracting key regions from the input image using a mask generated from both body pose and facial landmarks. This allows us to exploit body pose in addition to facial context whenever available. Second, to reason from the PAS images, we introduce context infusion (Cont-In) blocks. These blocks attend to part-specific information and pass it on to the intermediate features of an emotion recognition network. Our approach is conceptually simple and can be applied to any existing emotion recognition method. We provide results on the publicly available in-the-wild EMOTIC dataset. Compared to existing methods, PERI achieves superior performance and leads to significant improvements in the mAP of emotion categories, while decreasing Valence, Arousal, and Dominance errors. Importantly, we observe that our method improves performance both in images with fully visible faces and in images with occluded or blurred faces.
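As a rough illustration of the PAS idea described above, the sketch below builds a mask around body-pose and facial-landmark keypoints and suppresses the rest of the image. The function name `make_pas_image`, the square-region masking, and the `radius` parameter are our own assumptions for illustration; the paper's actual mask generation may well differ (e.g. soft or part-shaped masks).

```python
import numpy as np

def make_pas_image(image, keypoints, radius=2):
    """Hypothetical sketch of a part aware spatial (PAS) image:
    keep only square regions around keypoints (body-pose joints or
    facial landmarks, given as (row, col) pairs) and zero out the rest.
    """
    h, w = image.shape[:2]
    mask = np.zeros((h, w), dtype=image.dtype)
    for r, c in keypoints:
        r0, r1 = max(0, r - radius), min(h, r + radius + 1)
        c0, c1 = max(0, c - radius), min(w, c + radius + 1)
        mask[r0:r1, c0:c1] = 1  # retain this part-specific region
    return image * mask  # everything outside the mask is suppressed

# Toy usage: an 8x8 "image" with two keypoints retained
img = np.ones((8, 8))
pas = make_pas_image(img, [(2, 2), (6, 6)], radius=1)
```

A PAS image of this kind could then be fed to the Cont-In blocks, which modulate the intermediate features of the base emotion recognition network with the part-specific context.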