Abstract
Unlike English text, where spaces mark word boundaries, Chinese text is written as a continuous run of characters, so words cannot be separated by splitting on whitespace. Segmenting a passage of Chinese characters into individual words is therefore a central difficulty in Chinese information processing. This paper analyzes Chinese word segmentation algorithms and the related segmentation techniques, with the aim of helping more readers understand how these algorithms are applied in search engines.
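To make the difficulty concrete, the following minimal sketch contrasts whitespace splitting with forward maximum matching, one common dictionary-based segmentation approach. It is an illustration only and is not taken from the paper: the function name `forward_max_match`, the toy vocabulary, the sample strings, and the `max_len` limit are all assumptions chosen for the example.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy forward maximum matching over a toy vocabulary (hypothetical helper)."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the vocabulary matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


english = "search engines index words"
chinese = "中文分词技术"
toy_vocab = {"中文", "分词", "技术"}  # assumed toy dictionary, not a real lexicon

english_tokens = english.split()             # spaces delimit English words
chinese_whitespace_tokens = chinese.split()  # no spaces, so one unsplit token
chinese_tokens = forward_max_match(chinese, toy_vocab)

print(english_tokens)
print(chinese_whitespace_tokens)
print(chinese_tokens)
```

With this toy vocabulary the whitespace split leaves the Chinese string as a single token, while the dictionary lookup recovers three words. A real segmenter would need a far larger lexicon and a strategy for ambiguous or out-of-vocabulary strings.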