Paper Title
BCOT: A Markerless High-Precision 3D Object Tracking Benchmark
Paper Authors
Paper Abstract
Template-based 3D object tracking still lacks a high-precision benchmark of real scenes, because it is difficult to annotate the accurate 3D poses of real moving video objects without using markers. In this paper, we present a multi-view approach to estimate the accurate 3D poses of real moving objects, and then use binocular data to construct a new benchmark for monocular textureless 3D object tracking. The proposed method requires no markers, and the cameras only need to be synchronized, calibrated, and relatively fixed across views. Based on our object-centered model, we jointly optimize the object pose by minimizing shape re-projection constraints in all views, which greatly improves accuracy compared with single-view approaches and is even more accurate than depth-based methods. Our new benchmark dataset contains 20 textureless objects, 22 scenes, 404 video sequences, and 126K images captured in real scenes. The annotation error is guaranteed to be less than 2 mm, according to both theoretical analysis and validation experiments. We re-evaluate the state-of-the-art 3D object tracking methods with our dataset, reporting their performance ranking in real scenes. Our BCOT benchmark and code can be found at https://ar3dv.github.io/BCOT-Benchmark/.
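The core idea of jointly optimizing an object pose against re-projection constraints in all views can be sketched as follows. This is a minimal illustrative simplification, not the paper's actual shape-constraint formulation: it uses point (rather than contour) re-projection residuals, synthetic two-view data, and an assumed axis-angle pose parameterization, refined with `scipy.optimize.least_squares`.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation


def project(pts_world, K, R_cam, t_cam):
    """Project 3D world points into a camera with intrinsics K, extrinsics (R, t)."""
    p_cam = pts_world @ R_cam.T + t_cam          # world -> camera frame
    p_img = p_cam @ K.T                          # apply intrinsics
    return p_img[:, :2] / p_img[:, 2:3]          # perspective division


def residuals(pose, model_pts, cams, observations):
    """Stacked 2D re-projection residuals over ALL views for one object pose."""
    r, t = pose[:3], pose[3:]                    # axis-angle rotation, translation
    pts_world = model_pts @ Rotation.from_rotvec(r).as_matrix().T + t
    res = []
    for (K, R_cam, t_cam), uv in zip(cams, observations):
        res.append((project(pts_world, K, R_cam, t_cam) - uv).ravel())
    return np.concatenate(res)


# --- synthetic two-view setup (all numeric values are illustrative) ---
rng = np.random.default_rng(0)
model_pts = rng.uniform(-0.05, 0.05, (30, 3))    # object model points (metres)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
cams = [
    (K, np.eye(3), np.array([0.0, 0.0, 0.5])),   # view 1: frontal camera
    (K, Rotation.from_euler("y", 30, degrees=True).as_matrix(),
     np.array([0.1, 0.0, 0.5])),                 # view 2: rotated camera
]
true_pose = np.array([0.2, -0.1, 0.05, 0.01, 0.02, 0.03])

# Noise-free observations: projections of the model under the true pose.
R_obj = Rotation.from_rotvec(true_pose[:3]).as_matrix()
pts = model_pts @ R_obj.T + true_pose[3:]
observations = [project(pts, *cam) for cam in cams]

# Jointly refine the pose from a perturbed initial guess using both views.
sol = least_squares(residuals, true_pose + 0.05,
                    args=(model_pts, cams, observations))
```

Because the residual vector stacks every view, each camera constrains the depth ambiguity of the others, which is the intuition behind the accuracy gain over single-view optimization reported in the abstract.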