Paper title
UIT-HWDB: Using Transferring Method to Construct A Novel Benchmark for Evaluating Unconstrained Handwriting Image Recognition in Vietnamese
Paper authors
Paper abstract
Recognizing handwriting images is challenging because writing style varies widely from person to person and because each written language has its own linguistic characteristics. Vietnamese, in addition to the modern Latin characters, uses accent and letter marks, and some of the resulting characters confuse state-of-the-art handwriting recognition methods. Moreover, Vietnamese is a low-resource language: few datasets exist for handwriting recognition research, which is a barrier for researchers entering this area. Recent works evaluated offline handwriting recognition methods in Vietnamese on images built from an online handwriting dataset by connecting pen stroke coordinates without further processing. This approach cannot measure the ability of recognition methods effectively, because such images are trivial and may lack features that are essential in offline handwriting images. Therefore, in this paper, we propose the Transferring method to construct a handwriting image dataset that incorporates the crucial natural attributes required of offline handwriting images. Using our method, we provide the first high-quality synthetic dataset that is complex and natural enough to evaluate handwriting recognition methods effectively. In addition, we conduct experiments with various state-of-the-art methods to identify the challenges that remain for handwriting recognition in Vietnamese.