Research on Stuttered Speech Recognition Based on ANN and HMM Models (基于ANN和HMM模型的口吃语音识别研究)

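English title: The Research of Stuttered Speech Recognition Based on ANN and HMM
Author: Fang Fang (方芳)
Degree level: Master's
Discipline: Computer Application Technology
Keywords (translated from the Chinese): stuttering recognition; stuttering feature extraction; neural network; hidden Markov model; gray relational degree; segmental K-means algorithm
Keywords (English): Stuttering Recognition; Feature Extraction; Artificial Neural Network; Hidden Markov Model; Gray Relational Analysis; K-means Procedure
Degree year: 2010
Advisor: Xu Linying (许林英)
Discipline code: 081203
Degree-granting institution: Tianjin University (天津大学)
Thesis submission date: 2010-12-01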

Abstract

Stuttering is a speech disorder. With advances in artificial intelligence, the spread of computing, and the growing demand for intelligent medical care, automatic recognition of stuttering types has become a research topic of practical importance.
     Building on standard speech recognition techniques and the characteristics of stuttered speech, this thesis uses the spectral envelope as the feature representation and constructs an artificial neural network (ANN) and a hidden Markov model (HMM) to recognize stuttered speech automatically. The thesis first reviews the fundamentals and current state of speech recognition, then analyzes the history, current status, and difficulties of stuttered speech recognition, along with the overall workflow for classifying stuttering. The stuttered speech corpus built for this work contains four classes: blocked (paused) speech, repeated speech, prolonged speech, and fluent speech. Following current research practice, two manual segmentation methods are used to obtain the stuttered segments. The speech is preprocessed with pre-emphasis and framing into short, approximately stationary frames. Taking into account the language model and acoustic characteristics of stuttering, linear prediction cepstral coefficients (LPCC), a spectral-envelope feature, are then extracted as parameters and regularized using gray relational analysis and a uniform (equal-part) division method. The thesis then discusses in detail the analysis and design of applying the ANN and the HMM to stuttered speech recognition. The neural network is a three-layer feedforward perceptron, trained and used for recognition with the error backpropagation algorithm. The HMM is a continuous, left-to-right model; four models are built, one for each speech class, each with six states. The models are trained with the Baum-Welch algorithm, with the segmental K-means algorithm used to optimize training of the observation probability distributions, and recognition is performed with the Viterbi algorithm.
     Finally, the algorithms are implemented and evaluated experimentally. The results indicate that the recognition rate for the stuttering classes is fairly satisfactory. The thesis closes by summarizing the shortcomings and open problems of the experiments and the outlook for future work on stuttering recognition.
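
The front end described in the abstract (pre-emphasis, framing, LPCC extraction) can be sketched roughly as follows. This is an illustrative sketch and not the thesis's implementation: the pre-emphasis coefficient 0.97, the 256-sample Hamming-windowed frames with a 128-sample hop, the LPC order of 12, and all function names (`pre_emphasis`, `frame_signal`, `lpc`, `lpcc`, `extract_lpcc`) are assumptions chosen for the example. LPC coefficients are estimated with the autocorrelation method and the Levinson-Durbin recursion, and converted to cepstral coefficients with the standard LPC-to-cepstrum recursion.

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    # y[n] = x[n] - alpha * x[n-1]; alpha = 0.97 is an assumed, commonly used value.
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_signal(signal, frame_len=256, hop=128):
    # Split into overlapping frames and apply a Hamming window (assumed sizes).
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)

def lpc(frame, order):
    # Autocorrelation method with Levinson-Durbin recursion.
    # Returns a[1..p] for the predictor x[n] ~ sum_k a_k * x[n-k].
    r = np.array([np.dot(frame[:len(frame) - k], frame[k:])
                  for k in range(order + 1)])
    a = np.zeros(order + 1)
    err = r[0] + 1e-12  # small offset guards against all-zero frames
    for i in range(1, order + 1):
        k = (r[i] - np.dot(a[1:i], r[i - 1:0:-1])) / err  # reflection coefficient
        a_new = a.copy()
        a_new[i] = k
        a_new[1:i] = a[1:i] - k * a[i - 1:0:-1]
        a = a_new
        err *= (1.0 - k * k)
    return a[1:]

def lpcc(a, n_ceps):
    # Cepstrum of the all-pole model 1 / (1 - sum_k a_k z^-k), gain term ignored:
    # c_n = a_n + sum_{k=1}^{n-1} (k/n) * c_k * a_{n-k}, with a_n = 0 for n > p.
    p = len(a)
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

def extract_lpcc(signal, order=12, n_ceps=12):
    # One LPCC vector per frame; result has shape (n_frames, n_ceps).
    x = pre_emphasis(np.asarray(signal, dtype=float))
    return np.array([lpcc(lpc(f, order), n_ceps) for f in frame_signal(x)])

if __name__ == "__main__":
    # Synthetic noise stands in for a manually segmented speech clip.
    demo = np.random.default_rng(0).standard_normal(4000)
    print(extract_lpcc(demo).shape)
```

The resulting sequence of per-frame LPCC vectors varies in length with the duration of each segment, which is presumably why the thesis applies a regularization step before classification; that step is not reproduced here.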
