Paper Title
Counting People by Estimating People Flows
Paper Authors
Paper Abstract
Modern methods for counting people in crowded scenes rely on deep networks to estimate people densities in individual images. As such, only very few take advantage of temporal consistency in video sequences, and those that do only impose weak smoothness constraints across consecutive frames. In this paper, we advocate estimating people flows across image locations between consecutive images and inferring the people densities from these flows instead of directly regressing them. This enables us to impose much stronger constraints encoding the conservation of the number of people. As a result, it significantly boosts performance without requiring a more complex architecture. Furthermore, it allows us to exploit the correlation between people flow and optical flow to further improve the results. We also show that leveraging people conservation constraints in both a spatial and temporal manner makes it possible to train a deep crowd counting model in an active learning setting with much fewer annotations. This significantly reduces the annotation cost while still leading to similar performance to the full supervision case.
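To make the "conservation of the number of people" constraint concrete, one possible formulation (in notation introduced here purely for illustration; the paper's own definitions may differ) is the following. Let m_j^t denote the number of people in grid cell j of frame t, let f_{i,j}^{t-1,t} denote the flow of people from cell i to cell j between frames t-1 and t, and let N(j) be the set containing j and its spatial neighbors. Because people cannot appear or vanish between consecutive frames and can only move to adjacent cells within one frame interval, the count must satisfy

    m_j^t = \sum_{i \in N(j)} f_{i,j}^{t-1,t} = \sum_{k \in N(j)} f_{j,k}^{t,t+1},

that is, everyone counted in cell j arrived from a neighboring cell (or stayed in j) and will move to a neighboring cell (or stay) in the next frame. A network that regresses the flows f and derives the densities m from them respects this balance by construction, which is a stronger condition than merely smoothing density estimates across consecutive frames.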