Paper Title
Reducing Annotation Effort by Identifying and Labeling Contextually Diverse Classes for Semantic Segmentation Under Domain Shift
Paper Authors
Paper Abstract
In Active Domain Adaptation (ADA), one uses Active Learning (AL) to select a subset of images from the target domain, which are then annotated and used for supervised domain adaptation (DA). Given the large performance gap between supervised and unsupervised DA techniques, ADA allows for an excellent trade-off between annotation cost and performance. Prior art uses measures of model uncertainty or disagreement to identify `regions' to be annotated by the human oracle. However, these regions frequently comprise pixels at object boundaries, which are hard and tedious to annotate. Hence, even if the fraction of annotated image pixels is reduced, the overall annotation time and the resulting cost remain high. In this work, we propose an ADA strategy which, given a frame, identifies the set of classes that are hardest for the model to predict accurately, thereby recommending semantically meaningful regions to be annotated in the selected frame. We show that this set of `hard' classes is context-dependent, typically varies across frames, and, when annotated, helps the model generalize better. We propose two ADA techniques, the Anchor-based and Augmentation-based approaches, to select complementary and diverse regions in the context of the current training set. Our approach achieves 66.6 mIoU on the GTA to Cityscapes benchmark with an annotation budget of 4.7%, compared to 64.9 mIoU by MADA using 5% of annotations. Our technique can also be used as a decorator for any existing frame-based AL technique; e.g., we report a 1.5% performance improvement for CDAL on Cityscapes using our approach.
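To make the selection step concrete, below is a minimal sketch of per-frame hard-class identification and of the decorator-style use on frames chosen by another AL method. It assumes mean softmax confidence over the pixels predicted as each class as the hardness measure; the function names, this confidence criterion, and the toy data are illustrative assumptions, not the paper's exact Anchor-based or Augmentation-based procedure.

```python
import numpy as np

def hard_classes_for_frame(prob_map, num_hard=3):
    """Rank classes in one frame by the mean softmax confidence of pixels
    predicted as that class; return the num_hard lowest-scoring (hardest)
    class ids.  prob_map: (C, H, W) softmax output of the segmentation model."""
    pred = prob_map.argmax(axis=0)          # (H, W) predicted label map
    scores = []
    for c in range(prob_map.shape[0]):
        mask = pred == c
        # Classes absent from the frame get +inf so they are never selected.
        scores.append(prob_map[c][mask].mean() if mask.any() else np.inf)
    return np.argsort(scores)[:num_hard]

def region_mask(prob_map, hard_classes):
    """Binary mask of pixels predicted as any hard class; these semantically
    meaningful regions are what would be sent to the human oracle."""
    return np.isin(prob_map.argmax(axis=0), hard_classes)

def decorate_frame_selection(frames, prob_maps, num_hard=3):
    """Decorator-style use: restrict annotation in frames already chosen by
    any frame-based AL method (e.g., CDAL) to their hard-class regions."""
    return [(f, region_mask(p, hard_classes_for_frame(p, num_hard)))
            for f, p in zip(frames, prob_maps)]

# Toy usage: 5 classes on a 4x4 frame.
rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 4, 4))
probs = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
hard = hard_classes_for_frame(probs, num_hard=2)
print("hard classes:", hard,
      "| pixels to annotate:", int(region_mask(probs, hard).sum()))
```

Because the hardness ranking is computed per frame, the selected classes naturally vary with context across frames, which is the property the abstract highlights.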