Off-line Recognition of Chinese Handwriting: From Isolated Characters to Realistic Text
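Abstract
Because of its great application potential and its particular difficulty, off-line recognition of handwritten Chinese characters has attracted many researchers. Research over the past three decades has concentrated on handprinted Chinese characters. The results cover shape normalization, feature extraction, classifier design, and linguistic postprocessing, and the conditions for entering the era of handwritten text are now largely in place. This thesis aims to establish a basic framework for off-line recognition of Chinese handwritten text, ranging from basic data to an evaluation scheme, and from improved methods to entirely new research strategies. First, the HIT-MW database is built as basic data capable of supporting research on Chinese handwritten text; in the course of understanding the problem, criteria are defined for evaluating character segmentation and recognition algorithms. Handwritten text recognition is then studied along two different paths, the segmentation-based strategy and the segmentation-free strategy. Finally, after confirming that the two strategies are clearly complementary, a combination system based on both strategies is proposed.
     This thesis analyzes the future trends of handwritten Chinese character recognition and lays out the logical structure of the research. Taking the upgrade of the recognition object as the main thread, it first systematically summarizes the history of character recognition research. From this history, together with the current state of handwriting database construction and recognition strategies for Chinese characters, it concludes that Chinese handwritten text recognition will be the future focus of research. This marks a new era, the "era of handwritten text". Because the new era builds on the era of isolated handwritten characters, the thesis then reviews the major results in isolated handwritten Chinese character recognition in shape normalization, feature extraction, classifier design, and linguistic postprocessing.
     This thesis builds the HIT-MW database from a new perspective. HIT-MW is the first text-level Chinese handwriting database internationally, and its successful collection signals the start of the era of handwritten text. Its source texts come from the People's Daily corpus and cover 99.33% of the character usage in a corpus of about 8 million characters. The writers were carefully selected, giving statistics that largely agree with the real distribution. Thanks to a systematic sampling strategy and careful process control, HIT-MW contains not only skewed, overlapping, and touching text lines but also real handwriting phenomena such as copying errors and corrections. Substantial supporting evidence indicates that these data can be regarded as a representative subset of all Chinese handwritten text, and that recognition results obtained on them are statistically meaningful. The database has been adopted by more than ten research institutions.
     This thesis not only defines evaluation criteria for text research but also studies methods from the segmentation perspective. It first establishes basic criteria for text segmentation and recognition. To assess recognition quality, the recognition correct rate and the recognition accuracy rate are defined; together they characterize how a system balances deletion, insertion, and substitution errors. To compare character segmentation methods, the segmentation correct rate, segmentation precision rate, and segmentation bias rate are defined. Used together, these three criteria reveal a method's segmentation ability on digits, punctuation marks, and Chinese characters, and its bias toward over-segmentation or under-segmentation. The thesis then studies realistic text recognition based on the segmentation strategy and offers two important recommendations. First, when a new algorithm is designed, evidence that rests on an advantage under only one shape normalization configuration may not be credible; the ideal approach is to compare the new and old systems each under its own best shape normalization configuration. Second, the MQDF classifier should be improved by incorporating prior probability information; further analysis shows that priors estimated from a large corpus are more stable than priors estimated directly from the training set.
     This thesis proposes a segmentation-free method for recognizing realistic Chinese handwritten text. Training uses handwritten text lines directly, with no labeling of character positions, and recognition needs no character segmentation stage. Comparative experiments between a segmentation-based system and a segmentation-free system using the same type of features confirm the feasibility and great potential of the segmentation-free strategy. Within this framework, an enhanced four-plane feature (en-FPF) is proposed to address the weaknesses of the four-plane feature. Unlike earlier directional planes, the directional planes of en-FPF contain all the important information needed to reconstruct the original image. Experiments show that en-FPF gives better recognition performance on digits, punctuation marks, and Chinese characters, and that it is currently the single feature with the highest recognition rates in the segmentation-free framework. After en-FPF is fused with a simple cellular feature and combined with principal component analysis and data sharing, the recognition correct rate for Chinese characters still exceeds 50% even with sparse training data.
     After verifying that the two recognition strategies are complementary, this thesis designs two-strategy combination systems with a serial structure and a parallel structure. The character matching rate is first defined to reflect how well two systems complement each other at a given recognition correct rate. With this criterion, the two strategies are found to complement each other well even with the same training data and the same type of features. Two combination systems are then designed, extending the content and scope of multiple classifier research. The serial system inserts the segmentation-free recognizer into the character segmentation stage of the segmentation-based system: during recognition, the segmentation-free system runs first and the segmentation-based system runs afterwards. The parallel system runs the segmentation-based and segmentation-free systems in parallel, and a measure computed by the segmentation-based system then decides whether to output its own result or the segmentation-free result instead. Experimental results confirm the marked effectiveness of the two-strategy combination systems.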
Owing to its great application potential and its intellectual challenges, off-line recognition of handwritten Chinese characters has been studied intensively. Over the last three decades, most effort has gone into reliably recognizing handprinted Chinese characters. Considerable advances have been achieved in shape normalization, feature extraction, classifier design, and linguistic postprocessing, and the state of the art is now mature enough to usher in the era of handwritten text. This thesis aims to establish a fundamental framework for the off-line recognition of Chinese handwritten text. Its contributions range from gathering essential data to defining evaluation criteria, and from enhancing traditional methods to putting forward novel strategies. As the first step, the HIT-MW database is presented to support off-line recognition of Chinese handwritten text. For sound assessment, a series of evaluation criteria is then defined for character segmentation and text recognition. Subsequently, the recognition problem is addressed with two distinct strategies, a segmentation-based strategy and a segmentation-free one. Finally, because the two strategies are clearly complementary, two-strategy combination systems are proposed.
     This thesis infers future trends and lays out the logical structure of the research. The history of off-line character recognition is first summarized systematically, focusing on the upgrade of the recognition unit. Considering also the state of the art in Chinese character recognition, in both database collection and recognition methods, the thesis concludes that Chinese handwritten text recognition will be the next trend. A new era is emerging, which can be termed "the era of handwritten text". Since the new era grows out of "the era of isolated characters", recognition techniques for handwritten isolated Chinese characters are then surveyed under the headings of shape normalization, feature extraction, classifier design, and linguistic postprocessing.
     This thesis establishes the HIT-MW database from a novel perspective. It is the first text-level database of Chinese handwriting, and its successful collection marks the start of the era of handwritten text. The underlying texts are sampled from the People's Daily corpus and cover 99.33% of the character usage in a corpus of about 8 million characters. The writers were carefully selected so that their distribution closely matches real statistics. Thanks to a systematic sampling mechanism and strict process control, the database includes not only skewed, overlapping, and touching text lines but also realistic phenomena such as miswriting and erasure. Ample evidence supports that the HIT-MW database is representative of the whole population of Chinese handwritten text and that recognition results on it are statistically meaningful. Currently, the database is used by more than ten research institutions.
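The coverage figure can be made concrete with a small sketch. It assumes, as an interpretation not spelled out in the abstract, that coverage is measured over character occurrences: the share of all characters in the corpus whose character type also appears somewhere in the sampled texts. The function name `character_coverage` and both arguments are hypothetical.

```python
from collections import Counter


def character_coverage(corpus: str, sampled_text: str) -> float:
    """Fraction of corpus character occurrences whose type appears in sampled_text.

    Assumption: coverage is counted over occurrences (tokens), not over
    distinct character types. The abstract does not state which is meant.
    """
    counts = Counter(corpus)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("corpus must not be empty")
    covered_types = set(sampled_text)
    covered = sum(n for ch, n in counts.items() if ch in covered_types)
    return covered / total


# Toy example: the sampled text contains "a" and "b" but not "c",
# so 5 of the 6 corpus characters are covered.
print(character_coverage("aaabbc", "ab"))
```

Counting over occurrences rewards including frequent characters, which is why a modest sample of text can reach a high coverage of a large corpus.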
     This thesis first presents basic evaluation criteria for character segmentation and text recognition. To capture how a system balances deletion, insertion, and substitution errors, the recognition correct rate and the recognition accuracy rate are defined. To compare character segmentation methods, the segmentation correct rate, the segmentation precision rate, and the segmentation bias rate are defined. Used together, the three segmentation rates reveal a method's segmentation ability on digits, punctuation marks, and Chinese characters, and its tendency toward under-segmentation or over-segmentation. In addition, the transcription of realistic handwritten text with the segmentation-based strategy is studied, and two recommendations are given. First, the advantage of a new method may be doubtful if the evidence comes from only a single shape normalization setup; instead, the new and old systems should each be compared under their own best shape normalization setup. Second, the classifier based on the modified quadratic discriminant function (MQDF) should be improved by incorporating the prior probabilities of the character classes; moreover, estimating the priors from a large corpus rather than from the training data yields more stable results.
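The abstract does not give formulas for the two recognition rates. A minimal sketch follows, assuming the conventional definitions used for string recognition: with N reference characters and S substitutions, D deletions, and I insertions from a minimum edit-distance alignment, the correct rate is (N - S - D) / N and the accuracy rate is (N - S - D - I) / N. The function names are hypothetical.

```python
from typing import Tuple


def count_errors(reference: str, hypothesis: str) -> Tuple[int, int, int]:
    """Return (substitutions, deletions, insertions) of a minimum edit-distance alignment."""
    n, m = len(reference), len(hypothesis)
    # cell[i][j] = (total_errors, S, D, I) for reference[:i] against hypothesis[:j]
    cell = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cell[i][0] = (i, 0, i, 0)  # every reference character deleted
    for j in range(1, m + 1):
        cell[0][j] = (j, 0, 0, j)  # every hypothesis character inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, ins = cell[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                best = (t, s, d, ins)          # match, no new error
            else:
                best = (t + 1, s + 1, d, ins)  # substitution
            t, s, d, ins = cell[i - 1][j]
            if t + 1 < best[0]:
                best = (t + 1, s, d + 1, ins)  # deletion
            t, s, d, ins = cell[i][j - 1]
            if t + 1 < best[0]:
                best = (t + 1, s, d, ins + 1)  # insertion
            cell[i][j] = best
    _, s, d, ins = cell[n][m]
    return s, d, ins


def recognition_rates(reference: str, hypothesis: str) -> Tuple[float, float]:
    """Return (correct_rate, accuracy_rate) under the assumed definitions."""
    n = len(reference)
    if n == 0:
        raise ValueError("reference must not be empty")
    s, d, ins = count_errors(reference, hypothesis)
    return (n - s - d) / n, (n - s - d - ins) / n


# One substitution (c -> X) and one insertion (f) against a 5-character reference.
print(recognition_rates("abcde", "abXdef"))
```

Under these definitions the correct rate ignores insertions while the accuracy rate penalizes them, so comparing the two shows whether a system tends to emit spurious characters.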
     This thesis proposes a segmentation-free strategy to transcribe realistic handwritten Chinese text. Training does not require character positions, and recognition needs no character segmentation stage. Comparisons with a segmentation-based system using the same type of features show the feasibility and great potential of this strategy. An enhanced four-plane feature (en-FPF) is also proposed within the segmentation-free recognition framework. Unlike previous directional planes, the planes of en-FPF retain the information needed to reconstruct the original image. Experimental results show that en-FPF yields better recognition performance on digits, punctuation marks, and Chinese characters, and that it gives the highest recognition rates when only one kind of feature is used. Once the fusion of en-FPF and a simple cellular feature is processed with principal component analysis and data sharing techniques, the recognition correct rate for Chinese characters exceeds 50%, even under data sparseness.
     This thesis combines the segmentation-based strategy and the segmentation-free one in a serial structure and in a parallel structure, building on their complementary capacities. To measure how well two systems complement each other, the character matching rate (CMR) is defined first. With the help of CMR, the two strategies are shown to be complementary even when they use the same training data and the same type of features. Two combined systems are then constructed, one with a serial structure and one with a parallel structure, which extends the content and scope of multiple classifier combination. In the serial system, the segmentation-free system estimates the initial character boundaries; after a boundary refinement step, the segmentation-based system is launched. In the parallel system, the segmentation-free system runs simultaneously with the segmentation-based system, and the recognition confidence of the segmentation-based system determines which result is delivered. Experimental results demonstrate the effectiveness of the combinations.
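The decision rule of the parallel structure can be sketched as follows. This is an illustration under stated assumptions, not the thesis implementation: the recognizer interfaces, the confidence scale, and the fixed `threshold` are all hypothetical, and the two recognizers are called one after the other here rather than truly in parallel.

```python
from typing import Any, Callable, Tuple

# Hypothetical interfaces: the segmentation-based recognizer returns a
# transcription and a confidence value; the segmentation-free recognizer
# returns only a transcription.
SegBased = Callable[[Any], Tuple[str, float]]
SegFree = Callable[[Any], str]


def parallel_combine(
    line_image: Any,
    segmentation_based: SegBased,
    segmentation_free: SegFree,
    threshold: float = 0.5,  # assumed value, would need tuning on held-out data
) -> str:
    """Deliver the segmentation-based result when it is confident, else the segmentation-free one."""
    seg_text, seg_confidence = segmentation_based(line_image)
    free_text = segmentation_free(line_image)
    if seg_confidence >= threshold:
        return seg_text
    return free_text


# Toy usage with stub recognizers standing in for the real systems.
confident = lambda image: ("text from segmentation-based system", 0.9)
unsure = lambda image: ("text from segmentation-based system", 0.2)
free = lambda image: "text from segmentation-free system"

print(parallel_combine(None, confident, free))
print(parallel_combine(None, unsure, free))
```

Only the segmentation-based system's confidence drives the choice, which matches the description above; a symmetric rule that also weighs a segmentation-free score would be a different design.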
引文
1 G. Tauschek. Reading Machine. USA, 2026329. 1935
    
    2 S. Mori, C. Y. Suen, K. Yamamoto. Historical Review of OCR Research and Development. Proceedings of the IEEE. 1992, 80(7): 1029-1058
    
    3 S. Srihari, X. Yang, G. Ball. Offline Chinese Handwriting Recognition: An Assessment of Current Technology. Frontiers of Computer Science in China.2007, 1(2): 137-155
    
    4 R. Dai, C. Liu, B. Xiao. Chinese Character Recognition: History, Status and Prospects. Frontiers of Computer Science in China. 2007, 1(2): 126-136
    
    5 H. Fujisawa. Forty Years of Research in Character and Document Recognition: An Industrial Perspective. Pattern Recognition. 2008, 41(8):2435-2446
    
    6 A. Vinciarelli. A Survey on Off-line Cursive Word Recognition. Pattern Recognition. 2002, 35(7): 1433-1446
    
    7 G. Lorette. Handwriting Recognition Or Reading? What Is the Situation at the Dawn of the 3rd Millennium? International Journal on Document Analysis and Recognition. 1999, 2:2-12
    
    8 R. G. Casey, E. Lecolinet. A Survey of Methods and Strategies in Character Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence.1996, 18(7):690-706
    
    9 T. H. Hildebrandt, W. Liu. Recognition of Handwritten Chinese Characters:Advances Since 1980. Pattern Recognition. 1993, 26(2):205-225
    
    10 G. Nagy. At the Frontiers of OCR. Proceedings of the IEEE. 1992, 7:1093-1100
    
    11 R. Plamondon, S. N. Srihari. Online and Off-line Handwriting Recognition: A Comprehensive Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2000, 22(1):63-84
    
    12 C.-L. Liu, H. Fujisawa. Classification and Learning Methods for Character Recognition: Advances and Remaining Problems. Machine Learning in Document Analysis and Recognition. 2008:139-161
    
    13 丁晓青.汉字识别研究的回顾.电子学报.2002, 30(9):1364-1368
    14 W.S.Rohland.Character Sensing System.USA,2877951.1959
    15 R.B.Johnson.Indicia Controlled Record Perforating Machine.USA,2741312.1956
    16 I.Iijima,Y.Okumura,K.Kuwabara.New Process of Character Recognition Using Sieving Method.Information and Control Research.1963,1(1):30-35
    17 W.Highleyman.An Analog Method for Character Recognition.IRE Transactions on Electronic Computers.1961,EC-10:502-512
    18 E.C.Greanias.Some Important Factors in the Practical Utilization of Optical Character Readers.Optical Character Recognition.1962:129-146
    19 W.S.Rholand,R J.Traglia,R J.Hurley.The Design of an OCR System for Reading Handwritten Numerals.Proceedings of the Fall Joint Computer Conference.Montvale,N.J,1968:1151-1162
    20 H.Genchi,S.Mori,Watanabe,et al.Recognition of Handwritten Numerical Characters for Automatic Letter Sorting.Proceedings of the IEEE.1968,56(8):1292-1301
    21 I.Sheinberg.The INPUT-2 Document Reader.Pattern Recognition.1970,2(3):167-173
    22 R.Casey,G.Nagy.Recognition of Printed Chinese Characters.IEEE Transactions on Electronic Computers.1966,EC-15(1):91-101
    23 吴佑寿.教电脑识字-浅谈汉字识别.清华大学出版社,2000
    24 J.H.Munson.Experiments in the Recognition of Hand-printed Text:Part Ⅰ-Character Recognition.Proceedings of Fall Joint Computer Conference.Washington,DC,1968:1125-1138
    25 戴汝为,郝红卫,肖旭红.汉字识别的系统与集成.浙江科学技术出版社,1998
    26 C.Y.Suen,M.Berthod,S.Mori.Automatic Recognition of Handprinted Characters-the State of the Art.Proceedings of the IEEE.1980,68(4):469-487
    27 S.Mori,K.Yamamoto,H.Yamada,et al.On a Handprinted Kyoiku-Kanji Character Data Base.Bulletin of the Electrotechnical Laboratory.1979,43(11-12):752-773
    28 K.Sayre.Machine Recognition of Handwritten Words:A Project Report.Pattern Recognition.1973,5(3):213-228
    29 R. Ehrich, K. Koehler. Experiments in the Contextual Recognition of Cursive Script. IEEE Transactions on Computers. 1975, C-24(2):182- 194
    
    30 R. Farag. Word-level Recognition of Cursive Script. IEEE Transactions on Computers. 1979, C-28(2):172-175
    
    31 M. Yasuda, K. Yamamoto, H. Yamada, et al. An Improved Correlation Method for Character Recognition in a Reciprocal Feature Field. IECE Trans. 1985,J68-D(3):353-360
    
    32 T. Saito, H. Yamada, K. Yamamoto. On the Data Base ETL9 of Handprinted Characters in JIS Chinese Characters and its Analysis. IEICE Transactions.1985, J68-D(4):757-764
    
    33 J. W. Tai, C. Liu, L. Q. Zhang. A New Approach for Feature Extraction and Feature Selection of Handwritten Chinese Character Recognition. From Pixels to Features Ill-Frontiers in Handwritten Recognition. 1992:479-489
    
    34 Y. J. Liu, J. W. Tai, J. Liu. An Introduction to the 4 Million Handwriting Chinese Character Samples Library. Proceedings of the International Conference on Chinese Computing and Orient Language Processing. Changsha, China,1989:94-97
    
    35 L.-T. Tu, Y S. Lin, C. P. Yeh, et al. Recognition of Handprinted Chinese Characters by Feature Matching. Proceedings of the 1st National Workshop on Character Recognition. 1991:166-175
    
    36 Y. Y. Tang, L.-T. Tu, J. Liu, et al. Off-line Recognition of Chinese Handwriting by Multifeature and Multilevel Classification. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1998, 20(5):556-561
    
    37 D. H. Kim, Y. S. Hwang, S. T. Park, et al. Handwritten Korean Character Image Database PE92. Proceedings of the 2nd International Conference on Document Analysis and Recognition. Tsukuba, Japan, 1993:470-473
    
    38 D. H. Kim, Y. S. Hwang, S. T. Park, et al. Handwritten Korean Character Image Database PE92. IEICE Transactions on Information and Systems. 1996, E79-D(7):943-950
    
    39 C. Y. Suen, C. Nadal, R. Legault, et al. Computer Recognition of Unconstrained Handwritten Numerals. Proceedings of the IEEE. 1992, 80(7): 1162-1180
    
    40 J. Hull. A Database for Handwritten Text Recognition Research. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1994, 16(5):550-554
    41 A.W.Senior,A.J.Robinson.An Off-line Cursive Handwriting Recognition System.IEEE Transactions on Pattern Analysis and Machine Intelligence.1998,20(3):309-321
    42 金连文.SCUT-IRAC最新手写体汉字图像样本库.中文信息.1998,(1):101-102
    43 金连文.手写体汉字识别的研究.华南理工大学博士论文.1996
    44 U.V.Marti,H.Bunke.A Full English Sentence Database for Off-line Handwriting Recognition.Proceedings of the 5th International Conference on Document Analysis and Recognition.Bangalore,India,1999:705-708
    45 U.Marti,H.Bunke.The IAM-database:An English Sentence Database for Offline Handwriting Recognition.International Journal on Document Analysis and Recognition.2002,5(1):39-46
    46 C.Viard-Gaudin,R M.Lallican,S.Knerr,et al.The IRESTE On/Off (IRONOFF)Dual Handwriting Database.Proceedings of the 5th International Conference on Document Analysis and Recognition.Bangalore,India,1999:455-458
    47 郭军,蔺志青,张洪刚.一个新的脱机手写汉字数据库模型及其应用.电子学报.2000,28(5):115-116
    48 H.Zhang,J.Guo.Introduction to HCL2000 Database.Proceedings of Sino-Japan Symposium on Intelligent Information Networks.Beijing,2000
    49 E.Kavallieratou,N.Liolios,E.Koutsogeorgos,et al.GRUHD:A Greek Database of Unconstrained Handwriting.Proceedings of the 2nd International Conference on Language Resources and Evaluation.Athens,Greece,2000:1755-1759
    50 E.Kavallieratou,N.Liolios,E.Koutsogeorgos,et al.The GRUHD Database of Greek Unconstrained Handwriting.Proceedings of the 6th International Conference on Document Analysis and Recognition.Seattle,WA,USA,2001:561-565
    51 J.S.Park,H.J.Kang,S.W.Lee.Automatic Quality Measurement of Gray-scale Handwriting Based on Extended Average Entropy.Proceedings of the 15th International Conference on Pattern Recognition.Barcelona,Spain,2000:426-429
    52 C. Y. Suen, S. Mori, S. H. Kim, et al. Analysis and Recognition of Asian Scripts- the State of the Art. Proceedings of the 7th International Conference on Document Analysis and Recognition. Edinburgh, Scotland, 2003:866-878
    
    53 Y. Ge, Q. Huo. A Comparative Study of Several Modeling Approaches for Large Vocabulary Offline Recognition of Handwritten Chinese Characters. Proceedings of the 16th International Conference on Pattern Recognition. Quebec,Canada, 2002:85-88
    
    54 Y. Ge, Q. Huo, Z. D. Feng. Offline Recognition of Handwritten Chinese Characters Using Gabor Features, CDHMM Modeling and MCE Training. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. Orlando, FL, 2002:1053-1056
    
    55 葛勇.脱机手写汉字识别的统计建模方法研究.中国科学技术大学博士论文. 2002
    
    56 U. Bhattacharya, B. B. Chaudhuri. Databases for Research on Recognition of Handwritten Characters of Indian Scripts. Proceedings of the 8th International Conference on Document Analysis and Recognition. Seoul, Korea, 2005:789-793
    
    57 U. V. Marti, H. Bunke. Handwritten Sentence Recognition. Proceedings of the 15th International Conference on Pattern Recognition. 2000:463-466
    
    58 U. V. Marti, H. Bunke. Using a Statistical Language Model to Improve the Performance of an HMM-based Cursive Handwriting Recognition System. International Journal of Pattern Recognition and Artificial Intelligence. 2001,15(1):65-90
    
    59 U. Marti, H. Bunke. Towards General Cursive Script Recognition. Proceedings of the 6th International Workshop on Frontiers in Handwriting. Taejon,South Korea, 1998:379-388
    
    60 G. Kim, V. Govindaraju, S. N. Srihari. An Architecture for Handwritten Text Recognition Systems. International Journal on Document Analysis and Recognition. 1999, 2(1):37-44
    
    61 F. Bortolozzi, A. S. B. Jr, L. S. Oliveira, et al. Recent Advances in Handwriting Recognition. Document Analysis. Kalkuta, 2005:1-30
    
    62 A. Vinciarelli, S. Bengio, H. Bunke. Offline Recognition of Unconstrained Handwritten Texts Using HMMs and Statistical Language Models. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2004, 26(6):709-720
    63 国家技术监督局.GB2312-1980信息交换用汉字编码字符集-基本集.中国标准出版社,1980
    64 M.Cheriet,N.Kharma,C.-L.Liu,et al.Character Recognition Systems:A Guide for Students and Practitioners.John Wiley and Sons,2007
    65 C.-L.Liu.Handwritten Chinese Character Recognition:Effects of Shape Normalization and Feature Extraction.Arabic and Chinese Handwriting Recognition.2008:104-128
    66 L.Jin,G.Wei.Handwritten Chinese Character Recognition with Directional Decomposition Cellular Features.Journal of Circuit,System and Computer.1999,8(4):517-524
    67 高学,金连文,尹俊勋.一种基于笔画密度的弹性网格特征提取方法.模式识别与人工智能.2002,15(3):351-354
    68 高学.基于运动图像的手写汉字识别研究.华南理工大学博士论文.2003
    69 H.Yamada,K.Yamamoto,T.Saito.A Nonlinear Normalization Method for Handprinted Kanji Character Recognition-Line Density Equalization.Pattern Recognition.1990,23(9):1023-1029
    70 J.Tsukumo,H.Tanaka.Classification of Handprinted Chinese Characters Using Non-linear Normalization and Correlation Methods.Proceedings of the 9th International Conference on Pattern Recognition.Rome,Italy,1988:168-171
    71 R.C.Casey.Moment Normalization of Handprinted Characters.IBM Journal of Research and Development.1970,14(5):548-557
    72 C.-L.Liu,H.Sako,H.Fujisawa.Handwritten Chinese Character Recognition:Alternatives to Nonlinear Normalization.Proceedings of the 7th International Conference on Document Analysis and Recognition.Edinburgh,Scotland,2003:524-528
    73 C.-L.Liu,K.Marukawa.Global Shape Normalization for Handwritten Chinese Character Recognition:A New Method.Proceedings of the 9th International Workshop on Frontiers in Handwriting Recognition.2004:300-305
    74 C.-L.Liu,K.Marukawa.Pseudo Two-dimensional Shape Normalization Methods for Handwritten Chinese Character Recognition.Pattern Recognition.2005,38(12):2242-2255
    75 吴天雷,马少平.基于重叠动态网格和模糊隶属度的手写汉字特征抽取.电子学报.2004,32(2):186-190
    76 吴佑寿,丁晓青.汉字识别-原理方法与实现.高等教育出版社,1992
    77 H.Hao,X.Xiao,R.Dai.Handwritten Chinese Character Recognition by Meta-synthetic Approach.Pattern Recognition.1997,30(8):1321-1328
    78 N.Kato,M.Suzuki,S.Omachi,et al.A Handwritten Character Recognition System Using Directional Element Feature and Asymmetric Mahalanobis Distance.IEEE Transactions on Pattern Analysis and Machine Intelligence.1999,21(3):258-262
    79 Y.Chen,X.Ding,Y.Wu.Off-line Handwritten Chinese Character Recognition Based on Crossing Line Feature.Proceedings of the 4th International Conference on Document Analysis and Recognition.Ulm,Germany,1997:206-210
    80 陈友斌,丁晓青,吴佑寿.一种手写汉字特征抽取的新方法.信号处理.1998,14(2):117-122
    81 X.Wang,X.Ding,C.Liu.Gabor Filters-based Feature Extraction for Character Recognition.Pattern Recognition.2005,38(3):369-379
    82 吴锐,刘家锋,唐降龙,等.基于Gabor小波变换的汉字识别方法.高技术通讯.2005,15(3):7-10
    83 王学文,丁晓青,刘长松.基于Gabor变换的高鲁棒汉字识别新方法.电子学报.2002,30(9):1317-1322
    84 J.G.Daugman.Two-dimensional Spectral Analysis of Cortical Receptive Field Profiles.Vision Research.1980,20(10):847-856
    85 L.Shen,L.Bai.A Review on Gabor Wavelets for Face Recognition.Pattern Analysis and Applications.2006,9:273-292
    86 王海晶.AdaBoost学习机制及其在物体检测和识别中的应用.哈尔滨工业大学博士论文.2007
    87 C.Liu.Gabor-based Kernel PCA with Fractional Power Polynomial Models for Face Recognition.IEEE Transactions on Pattern Analysis and Machine Intelligence.2004,26(5):572-581
    88 J.G.Daugman.Probing the Uniqueness and Randomness of Iriscodes:Results from 200 Billion Iris Pair Comparisons.Proceedings of the IEEE.2006,94(11):1927-1935
    89 W.K.Kong,D.Zhang,W.Li.Palmprint Feature Extraction Using 2-D Gabor Filters.Pattern Recognition.2003,36(10):2339-2347
    90 Y.Hamamoto,S.Uchimura,M.Watanabe,et al.A Gabor Filter-based Method for Recognizing Handwritten Numerals.Pattern Recognition.1998,31(4):395-400
    91 R.O.Duda,R E.Hart,D.G.Stork.Pattern Classification(Second Edition).John Wiley and Sons,2001
    92 J.H.Friedman.Regularized Discriminant Analysis.Journal of the American Statistical Association.1989,84(405):165-175
    93 F.Kimura,K.Takashina,S.Tsuruoka,et al.Modified Quadratic Discriminant Functions and the Application to Chinese Character Recognition.IEEE Transactions on Pattern Analysis and Machine Intelligence.1987,9(1):149-153
    94 R.Zhang,X.Ding.Minimum Classification Error Training for Handwritten Character Recognition.Proceedings of the l6th International Conference on Pattern Recognition.2002:580-583
    95 Y.Yao.Handprinted Chinese Character Recognition via Neural Networks.Pattern Recognition Letters.1988,7(1):19-25
    96 B.S.Jeng,S.W.Sun.Chinese Character Recognition with Neural Nets Classifier.IEEE International Conference on Acoustic,Speech,and Signal Processing.New York,1990:2125-2128
    97 C.-L.Liu,K.Nakashima,H.Sako,et al.Handwritten Digit Recognition:Benchmarking of State of the Art Techniques.Pattern Recognition.2003,36(10):2271-2285
    98 C.-L.Liu,M.Nakagawa.Evaluation of Prototype Learning Algorithms for Nearest-neighbor Classifier in Application to Handwritten Character Recognition.Pattern Recognition.2001,34(3):601-615
    99 Y.LeCun,L.Bottou,Y.Bengio,et al.Gradient-based Learning Applied to Document Recognition.Proceedings of the IEEE.1998,86(1):2278-2324
    100 V.Vapnik.The Nature of Statistical Learning Theory.Springer,1995
    101 J.-X.Dong,A.Krzyzak,C.Y.Suen.An Improved Handwritten Chinese Character Recognition System Using Support Vector Machine.Pattern Recognition Letters.2005,26(12):1849-1856
    102 J.-X.Dong,A.Krzyzak,C.Y.Suen.Fast SVM Training Algorithm with Decomposition on Very Large Data Sets.IEEE Transactions on Pattern Analysis and Machine Intelligence.2005,27(4):603-618
    103 刘志斌,金连文.格子SVM-汉字识别中的新方法.第一届全国模式识别会议(CCPR).北京,2007:285-291
    104 Y.Ephraim,N.Merhav.Hidden Markov Processes.IEEE Transactions on Information Theory.2002,48(6):1518-1569
    105 L.E.Baum,T.Petrie.Statistical Inference for Probabilistic Functions of Finite State Markov Chains.Annual Mathematics Statistics.1966,37(6):1554-1563
    106 T.Petrie.Probabilistic Functions of Finite State Markov Chains.Annual Mathematics Statistics.1969,40(1):97-115
    107 L.E.Baum,T.Petrie,G.Soules,et al.A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains.Annual Mathematics Statistics.1970,41:164-171
    108 J.Raviv.Decision Making in Markov Chains Applied to the Problem of Pattern Recognition.IEEE Transactions on Information Theory.1967,IT-3(4):536-551
    109 F.Jelinek,L.R.Bahl,R.L.Mercer.Design of a Linguistic Statistical Decoder for Recognition of Continuous Speech.IEEE Transactions on Information Theory.1975,IT-21(3):250-256
    110 J.K.Baker.The DRAGON System-an Overview.IEEE Transactions on Acoustics,Speech and Signal Processing.1975,ASSP-23(1):24-29
    111 L.R.Rabiner.A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition.Proceedings of the IEEE.1989,77(2):257-285
    112 K.F.Lee.Context-dependent Phonetic Hidden Markov Models for Speaker-independent Continuous Speech Recognition.IEEE Transactions on Acoustics,Speech and Signal Processing.1990,3(4):599-609
    113 E.Charniak.Statistical Language Learning.MIT Press,1993
    114 A.Krogh,M.Brown.Hidden Markov Models in Computational Biology:Applications to Protein Modelling.Journal of Molecular Biology.1994,235:1501-1531
    115 B.-S.Jeng,M.-W.Chang,S.-W.Sun,et al.Optical Chinese Character Recognition with a Hidden Markov Model Classifier-a Novel Approach.Electronics Letters.1990,26(18):1530-1531
    116 B.Feng,X.Ding,Y.Wu.Chinese Handwriting Recognition Using Hidden Markov Models.Proceedings of the 16th International Conference on Pattern Recognition.Quebec,Canada,2002:212-215
    117 冯兵,丁晓青,吴佑寿.HMM方法识别脱机手写汉字.模式识别与人工智能.2002,15(1):84-88
    118 O.E.Agazzi,S.S.Kuo.Hidden Markov Model Based Optical Character Recognition in the Presence of Deterministic Transformations.Pattern Recognition.1993,26(12):1813-1826
    119 H.S.Park,S.W.LEE.A Truly 2-D Hidden Markov Model for Off-line Hand-written Character Recognition.Pattern Recogntion.1998,31(2):1894-1864
    120 H.S.Park,S.W.LEE.A 2-D HMM Method for Offline Handwritten Character Recognition.International Journal of Pattern Recognition and Artificial Intelligence.2001,15(1):91-105
    121 R-K.Wong,C.Chan.Post-processing Statistical Language Models for a Hand-written Chinese Character Recognizer.IEEE Transactions on System,Man,and Cybernetics-Part B.1999,29(2):286-291
    122 李元祥,丁晓青,吴佑寿.一种基于字词结合的汉字识别上下文处理新方法.计算机研究与发展.2002,39(7):838-842
    123 Y.Li,X.Ding,C.L.Tan.Combining Character-based Bigrams with Word-based Bigrams in Contextual Postprocessing for Chinese Script Recognition.ACM Transactions on Asian Language Information Processing.2002,1(4):297-309
    124 Y.Li,C.L.Tan,X.Ding.A Hybrid Post-processing System for Offline Hand-written Chinese Script Recognition.Pattern Analysis and Applications.2005,8(3):272-286
    125 Y.Li,C.L.Tan.An Empirical Study of Statistical Language Models for Contextual Post-processing of Chinese Script Recognition.Proceedings of the 9th International Workshop on Frontiers in Handwriting Recognition.2004:257-262
    126 Y.Li,C.L.Tan.Influence of Language Models and Candidate Set Size on Contextual Post-processing for Chinese Script Recognition.Proceedings of the 17th International Conference on Pattern Recognition.2004:537-540
    127 Y.Li,C.L.Tan,X.Ding,et al.Contextual Post-processing Based on the Confusion Matrix in Offline Handwritten Chinese Script Recognition.Pattern Recognition.2004,37(9):1901-1912
    128 张卿华.汉字笔迹与个性测评研究.心理科学.1998,21(4):301-305
    129 胡迎梅,杨丽,胡益清,等.书写习惯未定型成年人笔迹检验.刑事技术.2005,(5):34-36
    130 D.S.Moore.Statistics:Concepts and Controversies(Fifth Edition).W.H.Freeman,2001
    131 N.Otsu.A Threshold Selection Method from Gray-level Histogram.IEEE Transactions on System,Man,and Cybernetics.1979,SMC-9(1):62-66
    132 E.Kavallieratou,N.Fakotakis,G.Kokkinakis.Skew Angle Estimation for Printed and Handwritten Documents Using the Wigner-Ville Distribution.Image and Vision Computing.2002,20(11):813-824
    133 张一清,佟乐泉.视觉因素在儿童书写汉字中的作用-实验报告.现代汉语用字信息分析.1993:99-107
    134 国家统计局.中国统计年鉴.中国统计出版社,2005
    135 教育部.中国教育统计年鉴.人民教育出版社,1998
    136 国家技术监督局.GB/T15834-1995标点符号用法.中国标准出版社,1995
    137 A.Vinciarelli.Offline Cursive Handwriting:From Word to Text Recognition.Ph.d.thesis,University of Bern.2003
    138 F.Yin,C.-L.Liu.Handwritten Text Line Segmentation by Clustering with Distance Metric Learning.Proceedings of the 11th International Conference on Frontiers in Handwriting Recognition.2008:accepted
    139 J.Allen.Natural Language Understanding(Second Edition).Benjamin Cummings,1995
    140 L.Y.Tseng,R.C.Chen.Segmenting Handwritten Chinese Characters Based on Heuristic Merging of Stroke Bounding Boxes and Dynamic Programming.Pattern Recognition Letters.1998,19(10):963-973
    141 C.Hong,G.Loudon,Y.Wu,et al.Segmentation and Recognition of Continuous Handwriting Chinese Text.International Journal of Pattern Recognition and Artificial Intelligence.1998,12(2):223-232
    142 Y.-H.Tseng,H.-J.Lee.Recognition-based Handwritten Chinese Character Segmentation Using a Probabilistic Viterbi Algorithm.Pattern Recognition Letters.1999,20(8):791-806
    143 J.Gao,X.Ding,Y.Wu.A Segmentation Algorithm for Handwritten Chinese Character Strings.Proceedings of the 5th International Conference on Document Analysis and Recognition.1999:633-636
    144 J.Xue,X.Ding,C.Liu,et al.Location and Interpretation of Destination Addresses on Handwritten Chinese Envelopes.Pattern Recognition Letters.2001,22(6):639-656
    145 C.-L. Liu, M. Koga, H. Fujisawa. Lexicon-driven Segmentation and Recognition of Handwritten Character Strings for Japanese Address Reading. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2002, 24(11):1425-1437
    
    146 S. Zhao, Z. Chi, P. Shi, et al. Two-stage Segmentation of Unconstrained Handwritten Chinese Characters. Pattern Recognition. 2003, 36(1): 145-156
    
    147 G.-H. Li, P.-F. Shi. An Approach to Offline Handwritten Chinese Character Recognition Based on Segment Evaluation of Adaptive Duration. Jounal of Zhejiang University Science. 2004, 5(11): 1392-1397
    
    148 Z. Liang, P. Shi. A Metasynthetic Approach for Segmenting Handwritten Chinese Character Strings. Pattern Recognition Letters. 2005, 26(10):1498-1511
    
    149 Y. Jiang, X. Ding, Z. Ren. Substring Alignment Method for Lexicon Based Handwritten Chinese String Recognition and its Application to Address Line Recognition. Proceedings of the 18th International Conference on Pattern Recognition. 2006:683-686
    
    150 H. Ikeda. A Recognition Method for Touching Japanese Handwritten Characters. Proceedings of the 5th International Conference on Document Analysis and Recognition. 1999:641-644
    
    151 M. Hamanaka, K. Yamada, J. Tsukumo. Normalization-cooperated Feature Extraction Method for Handprinted Kanji Character Recognition. Proceedings of the 3rd International Workshop on Frontiers of Handwriting Recognition.1993:343-348
    
    152 K. Mikolajczyk, C. Schmid. A Performance Evaluation of Local Descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2005,27(10):1615-1630
    
    153 T. Iijima, H. Genchi, K. Mori. A Theoretical Study of Pattern Identification by Matching Method. Proceedings of the 1st USA- JAPAN Computer Conference.Tokyo, Japan, 1972:42-48
    
    154 M. Yasuda, H. Fujisawa. An Improved Correlation Method for Character Recognition. Systems, Computers and Control. 1979, 10(2):29-38 (translated from Transactions on IEICE Japan. 1979, J62-D(3):217-224)
    
    155 A. J. Viterbi. Error Bounds for Convolutional Codes and an Asymptotically Optimal Decoding Algorithm. IEEE Transactions on Information Theory. 1967,IT-13(4):260-269
    156 P. Natarajan, Z. Lu, R. M. Schwartz, et al. Multilingual Machine Printed OCR.International Journal of Pattern Recognition and Artificial Intelligence. 2001,15(1):43 - 63
    
    157 A. C. Rencher. Methods of Multivariate Analysis. John Wiley and Sons, 2002
    
    158 J. E. Freund. Modern Elementary Statistics. Prentice-Hall, 1984
    
    159 R. V. D. Heiden, F. C. A. Groen. The Box-Cox Metric for Nearest Neighbour Classification Improvement. Pattern Recognition. 1997, 30(2):273-279
    
    160 S. Tulyakov, S. Jaeger, V. Govindaraju, et al. Review of Classifier Combination Methods. Machine Learning in Document Analysis and Recognition. 2008:361-386
    
    161 L. Breiman. Bagging Predictors. Machine Learning. 1996, 23(2): 123-140
    
    162 C. M. Bishop. Pattern Recognition and Machine Learning. Springer Verlag, 2006
    
    163 R. E. Schapire. The Strength of Weak Learnability. Machine Learning. 1990, 5(2): 197-227
    
    164 Y. Freund. Boosting a Weak Learning Algorithm by Majority. Information and Computation. 1995, 121(2):256-285
    
    165 Y. Freund, R. E. Schapire. Experiments with a New Boosting Algorithm. Proceedings of the 13th International Conference on Machine Learning. 1996:148-156
    
    166 L. Xu, A. Krzyzak, C. Y. Suen. Methods for Combining Multiple Classifiers and Their Applications to Handwriting Recognition. IEEE Transactions on Systems, Man, and Cybernetics. 1992, 22(3):418-435
    
    167 L. I. Kuncheva. Combining Pattern Classifiers: Methods and Algorithms. John Wiley and Sons, 2004
    
    168 C.-L. Liu. Classifier Combination Based on Confidence Transformation. Pattern Recognition. 2005, 38(1):11-28
    
    169 A. P. Dempster. Upper and Lower Probabilities Induced by a Multi-valued Mapping. Annals of Mathematical Statistics. 1967, 38(2):325-339
    
    170 G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, 1976
    
    171 A. F. R. Rahman, M. C. Fairhurst. Multiple Classifier Decision Combination Strategies for Character Recognition: A Review. International Journal on Document Analysis and Recognition. 2003, 5(4): 166-194
    172 S. Madhvanath, V. Govindaraju. The Role of Holistic Paradigms in Handwritten Word Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2001, 23(2):149-164
    173 M. Mohamed, P. Gader. Handwritten Word Recognition Using Segmentation-free Hidden Markov Modeling and Segmentation-based Dynamic Programming Techniques. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1996, 18(5):548-554
    174 A. L. Koerich, R. Sabourin, C. Y. Suen. Recognition and Verification of Unconstrained Handwritten Words. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2005, 27(10):1509-1521
    175 S. Gunter, H. Bunke. Off-line Cursive Handwriting Recognition Using Multiple Classifier Systems - On the Influence of Vocabulary, Ensemble, and Training Set Size. Optics and Lasers in Engineering. 2005, 43(3-5):437-454
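    176 B. Zhang, L. Jin. AdaBoost-based Recognition of Similar Handwritten Chinese Characters. Proceedings of the 26th Chinese Control Conference. Zhangjiajie, 2007:576-579 (in Chinese)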
    177 U.-V. Marti, H. Bunke. Use of Positional Information in Sequence Alignment for Multiple Classifier Combination. Proceedings of the 2nd International Workshop on Multiple Classifier Systems. Cambridge, England, 2001:388-398
    178 R. Bertolami, H. Bunke. Ensemble Methods for Handwritten Text Line Recognition Systems. IEEE International Conference on Systems, Man and Cybernetics. 2005:2334-2339
    179 J. Fiscus. A Post-processing System to Yield Reduced Word Error Rates: Recognizer Output Voting Error Reduction. Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding. Santa Barbara, 1997:347-352
    180 R. Bertolami, H. Bunke. Multiple Classifier Methods for Offline Handwritten Text Line Recognition. Proceedings of the 7th International Workshop on Multiple Classifier Systems. Prague, Czech Republic, 2007:72-81
    181 R. Bertolami, H. Bunke. Hidden Markov Model-based Ensemble Methods for Offline Handwritten Text Line Recognition. Pattern Recognition. 2008, 41(11):3452-3460
