A Generalized & Robust Framework For Timestamp Supervision in Temporal Action Segmentation

Paper Title

A Generalized & Robust Framework For Timestamp Supervision in Temporal Action Segmentation

Authors

Rahul Rahaman, Dipika Singhania, Alexandre Thiery, Angela Yao

Abstract

In temporal action segmentation, Timestamp supervision requires only a handful of labelled frames per video sequence. For unlabelled frames, previous works rely on assigning hard labels, and performance rapidly collapses under subtle violations of the annotation assumptions. We propose a novel Expectation-Maximization (EM) based approach that leverages the label uncertainty of unlabelled frames and is robust enough to accommodate possible annotation errors. With accurate timestamp annotations, our proposed method produces SOTA results and even exceeds the fully-supervised setup in several metrics and datasets. When applied to timestamp annotations with missing action segments, our method presents stable performance. To further test our formulation's robustness, we introduce the new challenging annotation setup of Skip-tag supervision. This setup relaxes constraints and requires annotations of any fixed number of random frames in a video, making it more flexible than Timestamp supervision while remaining competitive.
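The two annotation setups named in the abstract can be contrasted with a small sketch. It assumes dense per-frame labels are available to sample from, that Timestamp supervision marks one frame in every action segment, and that frames are drawn uniformly at random; the abstract does not specify the sampling procedure, and the function names below are hypothetical, not taken from the paper.

```python
import random


def segments(labels):
    """Split dense per-frame labels into (start, end, label) runs; end is exclusive."""
    runs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            runs.append((start, i, labels[start]))
            start = i
    return runs


def timestamp_annotation(labels, rng):
    """Assumed Timestamp setup: one randomly chosen labelled frame per action segment."""
    return {rng.randrange(s, e): lab for s, e, lab in segments(labels)}


def skip_tag_annotation(labels, k, rng):
    """Assumed Skip-tag setup: k labelled frames drawn at random from the whole video."""
    frames = sorted(rng.sample(range(len(labels)), k))
    return {t: labels[t] for t in frames}


if __name__ == "__main__":
    video = ["pour"] * 5 + ["stir"] * 3 + ["pour"] * 4  # toy 12-frame video
    rng = random.Random(0)
    print(timestamp_annotation(video, rng))    # one frame index per segment
    print(skip_tag_annotation(video, 4, rng))  # 4 frames, segments may be missed
```

Under these assumptions, a Skip-tag draw can leave some action segments without any labelled frame, which is the kind of missing-segment condition the abstract says the method is meant to tolerate.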

Paper Title