Study for English Synthesis with Improved HMM Synthesis Systems

Original title: 改进的HMM系统在英语语音合成中的研究
Journal: Journal of Taiyuan University of Technology (太原理工大学学报)
Authors: ZHANG Xue-ying (张雪英); CHEN Jie (陈洁); SUN Ying (孙颖)
Affiliation: College of Information Engineering, Taiyuan University of Technology (TUT), Taiyuan 030024, China
Keywords: speech signal processing; HMM (hidden Markov model); trainable speech synthesis; English synthesis
Publication date: 2012-01-15
Year: 2012
Issue: 01
Publisher: Journal of Taiyuan University of Technology

Abstract

Speech synthesis is one of the key problems in realizing human-machine interaction. HMM-based speech synthesis makes it possible to build a synthesis system in a short time, which serves the goal of diverse speech synthesis. This paper applies HMM-based trainable speech synthesis to English. Based on the characteristics of the English language, the HMM modeling is improved: a set of contextual features suited to English speech synthesis and a corresponding question set for tree-based model clustering are designed, which improves modeling and training. In addition, with the help of tools such as HTK and Festival, and using the fundamental frequency and vocal tract spectral parameters as the training parameters, the English speech synthesis system is implemented. Judging from the synthesized sentences, the synthesized speech is clear, stable, and fluent overall, with a fairly strong sense of rhythm.
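
To illustrate what a question set for tree-based clustering does, the following minimal sketch shows how a yes/no question can split a pool of context-dependent labels. It is not the paper's actual feature set or question set: the label format (an HTS-like "left-current+right" triphone followed by an assumed stress flag "/S:1" or "/S:0"), the question names, and the patterns are all hypothetical, chosen only to show the mechanism.

```python
from fnmatch import fnmatchcase

# Hypothetical context-dependent labels: "left-current+right" phones,
# followed by an assumed stress flag ("/S:1" = stressed syllable).
LABELS = [
    "h-ae+t/S:1",
    "k-ae+t/S:0",
    "s-ih+t/S:1",
    "ae-t+sil/S:0",
]

# Hypothetical question set: each question is a name plus glob patterns.
# A label answers "yes" if it matches at least one pattern.
QUESTIONS = {
    "C-Vowel": ["*-ae+*", "*-ih+*"],   # is the current phone a vowel?
    "C-Stressed": ["*/S:1"],           # is the current syllable stressed?
    "L-Fricative": ["s-*", "h-*"],     # is the left phone a fricative?
}


def answers_yes(label, patterns):
    """Return True if the label matches any pattern of the question."""
    return any(fnmatchcase(label, pattern) for pattern in patterns)


def split(labels, question):
    """Split labels into (yes, no) groups according to one question."""
    patterns = QUESTIONS[question]
    yes = [label for label in labels if answers_yes(label, patterns)]
    no = [label for label in labels if not answers_yes(label, patterns)]
    return yes, no


if __name__ == "__main__":
    for name in QUESTIONS:
        yes, no = split(LABELS, name)
        print(f"{name}: yes={yes} no={no}")
```

In decision-tree clustering, each candidate question would be scored by how much the split improves the model likelihood, and the best one becomes a tree node; the scoring step is omitted here, so the sketch covers only the question-matching part.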
