Research on Pattern Recognition Technology and Its Application to Character Recognition
Abstract
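The goal of pattern recognition research is to build machine systems that, by simulating on a computer the mechanisms by which the human brain recognizes things, can take over classification and identification tasks from humans and process information automatically. Pattern recognition has great practical significance in many areas of social life and scientific research and has already been widely applied in many fields. With the rapid development of computer technology, artificial intelligence, and the science of thinking, pattern recognition is advancing to higher and deeper levels. Researchers have begun to study how computer systems can interpret images and understand the external world in a way similar to the human visual system, which is known as image understanding or computer vision, and have obtained many important results. Character recognition is among them. Character recognition is a typical pattern recognition problem and a very important application area of pattern recognition. As a means of information processing it has broad application prospects, and huge market demand is the fundamental driving force behind its rapid development. Research on character recognition therefore has both theoretical and practical significance.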
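     This thesis gives a comprehensive account of the feature extraction and classification methods used in character recognition and analyzes in depth the relationship between integration and classification. Then, following the basic idea of the comprehensive integration (metasynthesis) approach and taking into account the characteristics of a typical Chinese character set, it proposes corresponding recognition and integration methods. On this basis, a printed Chinese character recognition system is built.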
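     The Chinese character set is large, structurally complex, and rich in similar characters. Its size makes it difficult to use a network directly for classification and integration, while the complex structures and many similar characters prevent traditional structural analysis and statistical recognition methods from achieving satisfactory results. To address these problems, this thesis improves the proposed network integration method, presents three minimum-distance classifiers that extract different local features, and uses the method to build an integrated recognition system. Test results show that the recognition rate after integration is higher than that of the best single classifier, which demonstrates the effectiveness of the method.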
The aim of pattern recognition technology is to construct systems that automatically classify, recognize, and process information by simulating on a computer the mechanisms of human thinking. Pattern recognition is of great significance in daily life and scientific research and is already used in many fields. Character recognition is a very important and active research area within pattern recognition. Theoretically, it is not an isolated technique: it involves the problems that every other area of pattern recognition must face. Practically, as an information processing technology, character recognition has a very broad range of applications, and market demand is the fundamental driving force behind its rapid development. Research on character recognition is therefore of both theoretical and practical significance.
    This thesis describes the feature extraction and classification methods frequently used in character recognition and thoroughly analyzes the relationship between classification and integration. Following the basic idea of comprehensive integration, classification and integration methods are developed and a recognition system is established for a typical character set.
    The Chinese character set is large, structurally complex, and contains many similar characters. Its size makes it difficult to use a neural network directly for classification and integration. The complex structures and many similar characters make it very hard for traditional structural analysis and statistical methods to achieve satisfactory classification results. To address these problems, the proposed network integration method is improved. Three minimum-distance classifiers, which extract different local features, are proposed and combined into an integrated system using the above methods. Test results show that the recognition rate of the integrated system is higher than that of the best single classifier.
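
The idea of combining several minimum-distance classifiers can be illustrated with a minimal Python sketch. It is not the system described in the thesis: the abstract does not specify the local features or the details of the network integration method, so this sketch assumes that each "local feature" is simply a slice of a toy feature vector and that the classifiers are combined by summing their per-class distances. The names `MinDistanceClassifier` and `combined_predict` and the toy data are hypothetical.

```python
from collections import defaultdict
from math import dist


class MinDistanceClassifier:
    """Nearest-class-mean classifier over one slice of the feature vector."""

    def __init__(self, feature_slice):
        # Hypothetical stand-in for a "local feature" extractor.
        self.feature_slice = feature_slice
        self.prototypes = {}

    def _local(self, x):
        return x[self.feature_slice]

    def fit(self, samples, labels):
        grouped = defaultdict(list)
        for x, y in zip(samples, labels):
            grouped[y].append(self._local(x))
        # Each class prototype is the mean of its training vectors.
        self.prototypes = {
            y: [sum(col) / len(vecs) for col in zip(*vecs)]
            for y, vecs in grouped.items()
        }
        return self

    def distances(self, x):
        # Euclidean distance from the local features to every prototype.
        local = self._local(x)
        return {y: dist(local, p) for y, p in self.prototypes.items()}

    def predict(self, x):
        d = self.distances(x)
        return min(d, key=d.get)


def combined_predict(classifiers, x):
    """Assumed combination rule: sum each class's distances, pick the smallest."""
    totals = defaultdict(float)
    for clf in classifiers:
        for y, d in clf.distances(x).items():
            totals[y] += d
    return min(totals, key=totals.get)


# Toy data: 6-dimensional vectors, two classes, three 2-dimensional local views.
train_x = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [4, 4, 4, 4, 4, 4],
    [5, 5, 5, 5, 5, 5],
]
train_y = ["A", "A", "B", "B"]
classifiers = [
    MinDistanceClassifier(slice(start, start + 2)).fit(train_x, train_y)
    for start in (0, 2, 4)
]

x = [1, 1, 4, 4, 0, 0]  # the middle local view looks like class "B"
print([clf.predict(x) for clf in classifiers])
print(combined_predict(classifiers, x))
```

In this toy case the second classifier, which sees only the misleading middle slice, votes for "B", while the combined decision is "A". This mirrors, in a very simplified way, the claim that an integrated system can outperform its individual classifiers.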
