Quality-Constant Per-Shot Encoding by Two-Pass Learning-based Rate Factor Prediction

Paper Title

Quality-Constant Per-Shot Encoding by Two-Pass Learning-based Rate Factor Prediction

Authors

Chunlei Cai, Yi Wang, Xiaobo Li, Tianxiao Ye

Abstract

Providing quality-constant streams simultaneously guarantees user experience and avoids wasting bitrate. In this paper, we propose a novel deep-learning-based two-pass encoder parameter prediction framework that determines the rate factor (RF) with which the encoder can output streams of constant quality. For each one-shot segment in a video, the proposed method first extracts spatial, temporal, and pre-coding features using an ultra-fast pre-process. Based on these features, a deep neural network predicts an RF parameter. The video encoder uses this RF to compress the segment as the first encoding pass, and the VMAF quality of the first-pass encoding is then measured. If the quality does not meet the target, a second-pass RF prediction and encoding are performed. With the first-pass predicted RF and its corresponding actual quality as feedback, the second-pass prediction is highly accurate. Experiments show that the proposed method requires only 1.55 times the encoding complexity on average, while the accuracy, measured as the compressed video's actual VMAF falling within $\pm 1$ of the target VMAF, reaches 98.88%.
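
The two-pass control flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`two_pass_encode`, `predict_first`, `predict_second`, `encode_and_measure`), the `EncodeResult` type, and the $\pm 1$ acceptance tolerance used as the "meets target" test are all assumptions, and the toy linear RF-to-VMAF model merely stands in for a real encoder, a VMAF measurement, and the paper's neural-network predictors.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class EncodeResult:
    rf: float     # rate factor used for the accepted encode
    vmaf: float   # measured quality of the accepted encode
    passes: int   # number of encoding passes spent (1 or 2)


def two_pass_encode(
    features: Sequence[float],
    target_vmaf: float,
    predict_first: Callable[[Sequence[float], float], float],
    predict_second: Callable[[Sequence[float], float, float, float], float],
    encode_and_measure: Callable[[float], float],
    tolerance: float = 1.0,  # assumed acceptance window around the target
) -> EncodeResult:
    """Encode one shot; run a second pass only if the first misses the target."""
    # Pass 1: predict RF from pre-extracted features, encode, measure quality.
    rf1 = predict_first(features, target_vmaf)
    vmaf1 = encode_and_measure(rf1)
    if abs(vmaf1 - target_vmaf) <= tolerance:
        return EncodeResult(rf1, vmaf1, passes=1)

    # Pass 2: re-predict using the first-pass (RF, VMAF) pair as feedback.
    rf2 = predict_second(features, target_vmaf, rf1, vmaf1)
    vmaf2 = encode_and_measure(rf2)
    return EncodeResult(rf2, vmaf2, passes=2)


# --- Toy stand-ins (hypothetical; they replace the encoder and the networks) ---

def toy_encode_and_measure(rf: float) -> float:
    # Pretend quality falls linearly as RF grows.
    return 110.0 - 1.5 * rf


def toy_predict_first(features: Sequence[float], target_vmaf: float) -> float:
    # Deliberately biased model of the toy encoder (intercept 108, not 110).
    return (108.0 - target_vmaf) / 1.5


def toy_predict_second(
    features: Sequence[float], target_vmaf: float, rf1: float, vmaf1: float
) -> float:
    # Correct the first guess using the observed quality error as feedback.
    return rf1 + (vmaf1 - target_vmaf) / 1.5


result = two_pass_encode(
    features=[0.0],
    target_vmaf=90.0,
    predict_first=toy_predict_first,
    predict_second=toy_predict_second,
    encode_and_measure=toy_encode_and_measure,
)
print(result)
```

In this toy setup the first prediction misses the target by two VMAF points, so the second pass is triggered and the feedback pair is used to correct the RF. In a real pipeline, shots whose first pass already lands inside the tolerance are encoded only once, which keeps the average cost below two full encodes per shot.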
