Paper Title

Threshold Matters in WSSS: Manipulating the Activation for the Robust and Accurate Segmentation Model Against Thresholds

Authors

Minhyun Lee, Dongseob Kim, Hyunjung Shim

Abstract

Weakly-supervised semantic segmentation (WSSS) has recently gained much attention for its promise of training segmentation models with only image-level labels. Existing WSSS methods commonly argue that the sparse coverage of class activation maps (CAMs) is the performance bottleneck of WSSS. This paper provides analytical and empirical evidence that the actual bottleneck may not be sparse coverage but the global thresholding scheme applied after CAM. We then show that this issue can be mitigated by satisfying two conditions: 1) reducing the imbalance in the foreground activation, and 2) increasing the gap between the foreground and the background activation. Based on these findings, we propose a novel activation manipulation network with a per-pixel classification loss and a label conditioning module. Per-pixel classification naturally induces two-level activation in the activation maps, which penalizes the most discriminative parts, promotes the less discriminative parts, and deactivates the background regions. Label conditioning requires that the output label of each pseudo-mask pixel be one of the true image-level labels; it penalizes wrong activation assigned to non-target classes. Through extensive analysis and evaluation, we demonstrate that each component helps produce accurate pseudo-masks and makes them robust to the choice of the global threshold. Finally, our model achieves state-of-the-art results on both the PASCAL VOC 2012 and MS COCO 2014 datasets.
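To make the global thresholding step concrete, the following is a minimal sketch of how CAMs are commonly turned into pseudo-masks with a single threshold. It is not the authors' code: the function name `cam_to_pseudo_mask`, the per-class max normalization, and the toy values are assumptions chosen for illustration. Zeroing absent classes here is only the usual post-processing step, not the paper's learned label conditioning module.

```python
import numpy as np


def cam_to_pseudo_mask(cams, image_labels, threshold):
    """Convert class activation maps into a pseudo-mask with one global threshold.

    cams: array of shape (C, H, W), non-negative activations for C foreground classes.
    image_labels: length-C binary vector of image-level labels.
    threshold: global background score in [0, 1] (hypothetical parameter name).
    Returns an (H, W) integer mask: 0 = background, k + 1 = foreground class k.
    """
    cams = np.asarray(cams, dtype=np.float64)
    present = np.asarray(image_labels, dtype=bool)

    # Drop activations of classes that are absent from the image-level labels.
    cams = cams * present[:, None, None]

    # Normalize each class map by its own peak so values lie in [0, 1].
    peak = cams.max(axis=(1, 2), keepdims=True)
    normalized = cams / np.maximum(peak, 1e-8)

    # The same constant score is used as the background "class" at every pixel.
    background = np.full((1,) + cams.shape[1:], threshold)

    # Each pixel takes the label with the highest score.
    return np.argmax(np.concatenate([background, normalized], axis=0), axis=0)


# Toy example: one class, one row of four pixels. Assume the object covers the
# first three pixels, but only the first one is strongly activated.
toy_cam = np.array([[[1.0, 0.4, 0.3, 0.05]]])
print(cam_to_pseudo_mask(toy_cam, [1], threshold=0.2))  # [[1 1 1 0]]
print(cam_to_pseudo_mask(toy_cam, [1], threshold=0.5))  # [[1 0 0 0]]
```

In this toy case the imbalanced foreground activation means the mask changes from full coverage to a single pixel when the threshold moves from 0.2 to 0.5. A more balanced foreground and a larger foreground-background gap, the two conditions named in the abstract, would make the result less sensitive to that choice.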
