CLIP-FLow: Contrastive Learning by semi-supervised Iterative Pseudo labeling for Optical Flow Estimation

Paper title

CLIP-FLow: Contrastive Learning by semi-supervised Iterative Pseudo labeling for Optical Flow Estimation

Authors

Zhiqi Zhang, Nitin Bansal, Changjiang Cai, Pan Ji, Qingan Yan, Xiangyu Xu, Yi Xu

Abstract

Synthetic datasets are often used to pretrain end-to-end optical flow networks because large amounts of labeled real-scene data are lacking. However, accuracy drops sharply when moving from synthetic to real scenes. How can we better transfer the knowledge learned in the synthetic domain to the real domain? To this end, we propose CLIP-FLow, a semi-supervised iterative pseudo-labeling framework that transfers the pretraining knowledge to the target real domain. We leverage large-scale, unlabeled real data to facilitate transfer learning under the supervision of iteratively updated pseudo-ground-truth labels, bridging the domain gap between synthetic and real data. In addition, we propose a contrastive flow loss on the reference features and the features warped by the pseudo-ground-truth flows, to further boost accurate matching and dampen mismatching caused by motion, occlusion, or noisy pseudo labels. We adopt RAFT as the backbone and obtain an F1-all error of 4.11%, i.e., a 19% error reduction from RAFT (5.10%), ranking 2$^{nd}$ at the time of submission on the KITTI 2015 benchmark. Our framework can also be extended to other models, e.g., CRAFT, reducing the F1-all error from 4.79% to 4.66% on the KITTI 2015 benchmark.
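
The relative error reductions quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper name `relative_reduction` is hypothetical and not part of the paper, and the only inputs are the F1-all percentages stated above.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the relative reduction (before - after) / before as a fraction."""
    if before <= 0:
        raise ValueError("baseline error must be positive")
    return (before - after) / before


# F1-all errors (in percent) on KITTI 2015, as quoted in the abstract.
raft_baseline, raft_clip_flow = 5.10, 4.11
craft_baseline, craft_clip_flow = 4.79, 4.66

raft_gain = relative_reduction(raft_baseline, raft_clip_flow)
craft_gain = relative_reduction(craft_baseline, craft_clip_flow)

print(f"RAFT:  {raft_gain:.1%}")   # prints "RAFT:  19.4%"
print(f"CRAFT: {craft_gain:.1%}")  # prints "CRAFT: 2.7%"
```

The RAFT figure matches the 19% reduction stated in the abstract. The CRAFT figure is not given in the abstract; it is simply what the same formula yields for the quoted numbers.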
