Paper Title

Disparity-based Stereo Image Compression with Aligned Cross-View Priors

Paper Authors

Yongqi Zhai, Luyang Tang, Yi Ma, Rui Peng, Ronggang Wang

Paper Abstract

With the wide application of stereo images in various fields, research on stereo image compression (SIC) has attracted extensive attention from academia and industry. The core of SIC is to fully exploit the mutual information between the left and right images and reduce the redundancy between views as much as possible. In this paper, we propose DispSIC, an end-to-end trainable deep neural network, in which we jointly train a stereo matching model to assist the image compression task. Based on the stereo matching result (i.e., the disparity), the right image can easily be warped to the left view, so only the residuals between the left and right views need to be encoded for the left image. DispSIC adopts a three-branch auto-encoder architecture that encodes the right image, the disparity map, and the residuals, respectively. During training, the whole network learns to adaptively allocate bitrate among these three parts, achieving better rate-distortion performance at the cost of only a low bitrate for the disparity map. Moreover, we propose a conditional entropy model with aligned cross-view priors for SIC, which takes the warped latents of the right image as priors to improve the accuracy of the probability estimation for the left image. Experimental results demonstrate that our proposed method achieves superior performance compared to existing SIC methods on the KITTI and InStereo2K datasets, both quantitatively and qualitatively.
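
As a rough illustration of the warping step described in the abstract, below is a minimal sketch (not the authors' released code) of how a rectified right image can be warped to the left view with a per-pixel horizontal disparity map using PyTorch's grid_sample. The disparity sign convention, tensor shapes, and function name are assumptions made for this example.

```python
# Minimal sketch: warp a rectified right image to the left view using a
# horizontal disparity map, then form the residual that would be encoded
# for the left image. Sign convention (left(x) = right(x - d)) is assumed.
import torch
import torch.nn.functional as F

def warp_right_to_left(right: torch.Tensor, disparity: torch.Tensor) -> torch.Tensor:
    """right: (N, C, H, W) image; disparity: (N, 1, H, W) offsets in pixels."""
    n, _, h, w = right.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=right.dtype, device=right.device),
        torch.arange(w, dtype=right.dtype, device=right.device),
        indexing="ij",
    )
    # For a rectified pair, the left-view pixel (x, y) maps to (x - d, y)
    # in the right image.
    x_src = xs.unsqueeze(0) - disparity.squeeze(1)
    y_src = ys.unsqueeze(0).expand_as(x_src)
    # Normalize coordinates to [-1, 1] as required by grid_sample.
    grid = torch.stack(
        (2.0 * x_src / (w - 1) - 1.0, 2.0 * y_src / (h - 1) - 1.0), dim=-1
    )
    return F.grid_sample(right, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)

left = torch.rand(1, 3, 64, 128)           # placeholder left image
right = torch.rand(1, 3, 64, 128)          # placeholder right image
disp = torch.full((1, 1, 64, 128), 4.0)    # placeholder disparity map
residual = left - warp_right_to_left(right, disp)  # residual to be encoded
```

In DispSIC, a residual formed this way is encoded by one branch of the three-branch auto-encoder, while the other two branches encode the right image and the disparity map.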
