Synthesizing Annotated Image and Video Data Using a Rendering-Based Pipeline for Improved License Plate Recognition

Paper Title

Synthesizing Annotated Image and Video Data Using a Rendering-Based Pipeline for Improved License Plate Recognition

Authors

Spruck, Andreas; Gruber, Maximilane; Maier, Anatol; Moussa, Denise; Seiler, Jürgen; Riess, Christian; Kaup, André

Abstract

An insufficient number of training samples is a common problem in neural network applications. While data augmentation methods require at least a minimum number of samples, we propose a novel, rendering-based pipeline for synthesizing annotated data sets. Our method does not modify existing samples but synthesizes entirely new ones. The proposed pipeline generates and annotates synthetic and partly-real image and video data in a fully automatic procedure, and it can also aid the acquisition of real data. Synthetic data are produced by the rendering process itself. Partly-real data bring the synthetic sequences closer to reality by incorporating real cameras during the acquisition process. The benefits of the proposed data generation pipeline, especially for machine learning scenarios with limited available training data, are demonstrated by an extensive experimental validation in the context of automatic license plate recognition. The experiments demonstrate a significant reduction of the character error rate and miss rate from 73.74% and 100% to 14.11% and 41.27%, respectively, compared to an OCR algorithm trained solely on a real data set. These improvements are achieved by training the algorithm solely on synthesized data. When real data are additionally incorporated, the error rates decrease further: the character error rate and miss rate drop to 11.90% and 39.88%, respectively. All data used in the experiments, as well as the proposed rendering-based pipeline for automated data generation, are made publicly available (URL will be revealed upon publication).
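
The abstract reports results as a character error rate without defining it. As a rough illustration only, the sketch below uses the common definition of that metric: the total Levenshtein edit distance between predicted and reference strings, divided by the total number of reference characters. This definition is an assumption and may differ from the one used in the paper. The function names and the example plate strings are hypothetical. The miss rate is not sketched because the abstract does not say how it is computed.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning hyp into ref."""
    # prev[j] holds the distance between the ref prefix processed so far
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Assumed definition: summed edit distance over summed reference length."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference strings are empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars


# Hypothetical license plate strings, not taken from the paper's data.
references = ["ABC123", "XY99"]
predictions = ["ABC128", "XY9"]
cer = character_error_rate(references, predictions)
```

With these made-up strings, one substitution and one missing character over ten reference characters give a rate of 0.2. Under this definition the rate can exceed 1 when the predictions contain many spurious characters.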
