Paper Title
When Facial Expression Recognition Meets Few-Shot Learning: A Joint and Alternate Learning Framework
Paper Authors
Paper Abstract
Human emotions involve basic and compound facial expressions. However, current research on facial expression recognition (FER) mainly focuses on basic expressions, and thus fails to address the diversity of human emotions in practical scenarios. Meanwhile, existing work on compound FER relies heavily on abundant labeled compound expression training data, which are often laboriously collected under the professional guidance of psychologists. In this paper, we study compound FER in the cross-domain few-shot learning setting, where only a few images of novel classes from the target domain are required as a reference. In particular, we aim to identify unseen compound expressions with a model trained on easily accessible basic expression datasets. To alleviate the problem of limited base classes in our FER task, we propose a novel Emotion Guided Similarity Network (EGS-Net), consisting of an emotion branch and a similarity branch, based on a two-stage learning framework. Specifically, in the first stage, the similarity branch is jointly trained with the emotion branch in a multi-task fashion. With the regularization of the emotion branch, we prevent the similarity branch from overfitting to sampled base classes that overlap heavily across different episodes. In the second stage, the emotion branch and the similarity branch play a "two-student game" to alternately learn from each other, thereby further improving the inference ability of the similarity branch on unseen compound expressions. Experimental results on both in-the-lab and in-the-wild compound expression datasets demonstrate the superiority of our proposed method over several state-of-the-art methods.
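The overlap problem mentioned in the abstract can be illustrated with a small sketch. It is not the authors' code: the seven class names, the 5-way episode size, and the helper names `sample_episode_classes` and `min_pairwise_overlap` are all assumptions made for illustration. The point is only that when the pool of base classes is small, any two sampled episodes necessarily share most of their classes, which is the setting in which a similarity branch trained on episodes alone could overfit.

```python
import random
from itertools import combinations

# Assumption: seven basic expression labels serve as base classes.
# The abstract does not list the classes or the episode size.
BASE_CLASSES = ["anger", "disgust", "fear", "happiness",
                "sadness", "surprise", "neutral"]


def sample_episode_classes(rng, n_way):
    """Pick the n_way base classes that define one few-shot episode."""
    return frozenset(rng.sample(BASE_CLASSES, n_way))


def min_pairwise_overlap(episodes):
    """Return the smallest number of classes shared by any two episodes."""
    return min(len(a & b) for a, b in combinations(episodes, 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
episodes = [sample_episode_classes(rng, n_way=5) for _ in range(100)]
shared = min_pairwise_overlap(episodes)

# Pigeonhole bound: two 5-class subsets of a 7-class pool
# share at least 5 + 5 - 7 = 3 classes.
print(f"minimum classes shared by any two episodes: {shared}")
```

Under these assumed sizes, every pair of episodes shares at least three of its five classes, regardless of the random seed. With a large pool of base classes, as in common few-shot benchmarks, the same bound would be zero.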