Paper title
A complete character recognition and transliteration technique for Devanagari script
Paper authors
Paper abstract
Transliteration converts text from one script to another based on phonetic similarities between the characters of the two scripts. In this paper, we present a novel technique for automatic transliteration of Devanagari script using character recognition. Segmentation is one of the first tasks performed to isolate the constituent characters. The line segmentation method in this manuscript addresses the case of overlapping lines. The character segmentation algorithm is designed to segment conjuncts and to separate shadow characters. The presented shadow character segmentation scheme employs a connected component method to isolate each character while keeping the constituent characters intact. Statistical features, namely moments of different orders such as area, variance, skewness, and kurtosis, are employed along with structural features of the characters in a two-phase recognition process. After recognition, the constituent Devanagari characters are mapped to corresponding Roman letters such that the resulting Roman letters have a pronunciation similar to that of the source characters.
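The final mapping step described in the abstract can be illustrated with a minimal sketch. It assumes the recognizer has already produced a string of Devanagari code points. The tables `CONSONANTS`, `VOWELS`, and `MATRAS`, the romanization choices (for example "aa" for the long vowel), and the function `transliterate` are all hypothetical and are not taken from the paper; they cover only a handful of characters.

```python
# Hypothetical, deliberately tiny mapping tables (NOT the paper's tables).
CONSONANTS = {
    "क": "k", "ख": "kh", "ग": "g", "त": "t", "द": "d",
    "न": "n", "म": "m", "र": "r", "व": "v", "स": "s",
}
VOWELS = {"अ": "a", "आ": "aa", "इ": "i", "ई": "ii", "उ": "u", "ए": "e", "ओ": "o"}
# Dependent vowel signs (matras), written as escapes because they are
# combining characters that render poorly on their own.
MATRAS = {
    "\u093e": "aa",  # aa sign
    "\u093f": "i",   # i sign
    "\u0940": "ii",  # ii sign
    "\u0941": "u",   # u sign
    "\u0947": "e",   # e sign
    "\u094b": "o",   # o sign
}
VIRAMA = "\u094d"  # suppresses the inherent vowel, as in conjuncts


def transliterate(text: str) -> str:
    """Map recognized Devanagari characters to Roman letters (assumed scheme)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in MATRAS:
                # A vowel sign replaces the inherent "a".
                out.append(MATRAS[nxt])
                i += 1
            elif nxt == VIRAMA:
                # Virama: bare consonant, no vowel emitted.
                i += 1
            else:
                # No sign follows, so the inherent vowel is pronounced.
                out.append("a")
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        else:
            # Characters outside the toy tables pass through unchanged.
            out.append(ch)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("\u0928\u092e\u0938\u094d\u0924\u0947"))
```

This sketch always emits the inherent vowel after a consonant that carries no sign, so it does not model word-final vowel deletion as spoken in Hindi; a fuller scheme would need language-specific rules on top of the character mapping.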