Review and Outlook of Endpoint Detection Methods for Speech Signals (语音信号端点检测方法综述及展望)

English title: Summary and survey of endpoint detection algorithm for speech signals
Journal (Chinese): 计算机应用研究
Journal (English): Application Research of Computers
Authors: 刘华平; 李昕; 徐柏龄; 姜宁
Authors (English): LIU Hua-ping(1); LI Xin(1,2,3); XU Bo-ling(3); JIANG Ning(1) (1. School of Electromechanical Engineering & Automation, Shanghai University, Shanghai 200072, China; 2. State Key Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China; 3. Dept. of Electronic Science & Engineering, Nanjing University, Nanjing 210093, China)
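Keywords (Chinese): 语音信号处理; 端点检测; 鲁棒性
Keywords (English): speech signals processing; endpoint detection; robustness
Publication date: 2008-08-15
Institutions: School of Electromechanical Engineering and Automation, Shanghai University; Department of Electronic Science and Engineering, Nanjing University
Year: 2008
Issue: 08
Publisher: Application Research of Computers (计算机应用研究)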

Abstract

Endpoint detection is a critical step in speech signal processing: its accuracy directly affects both the speed and the results of all subsequent processing. Research on endpoint detection methods, especially under noisy conditions, has therefore long been an active topic in the field. This paper reviews the development of endpoint detection methods based on time-domain parameters, frequency-domain parameters, time-frequency parameters, and model matching. It compares the strengths and weaknesses of each approach, suggests improvements to them, and discusses directions for future research on endpoint detection.

References

[1] LAMEL L F, RABINER L R, ROSENBERG A E, et al. An improved endpoint detector for isolated word recognition[J]. IEEE Trans on Acoust, Speech, Signal Processing, 1981, 29(8): 777-785.
[2] LU Lie, JIANG Hao, ZHANG Hong-jiang. A robust audio classification and segmentation method[C]//Proc of the 9th ACM International Conference on Multimedia. 2001.
[3] SAVOJI M H. A robust algorithm for accurate endpointing of speech[J]. Speech Communication, 1989, 8(1): 45-60.
[4] 贾川, 张健, 陈振标, et al. Research on endpoint detection algorithms in noisy environments[C]//Proc of the 6th National Conference on Man-Machine Speech Communication. 2001: 441-445. (in Chinese)
[5] RABINER L R, SAMBUR M R. An algorithm for determining the endpoints of isolated utterances[J]. Bell System Technical Journal, 1975, 54(2): 297-315.
[6] 张仁志, 崔慧娟. Research on a speech endpoint detection algorithm based on short-time energy[J]. 电声技术, 2005(7): 52-54. (in Chinese)
[7] 肖述才, 王作英. A new logarithmic energy feature for endpoint detection[J]. 电声技术, 2004(6): 37-41. (in Chinese)
[8] 李明远, 李建东. A correlation-based voice activity detector[J]. 电声技术, 1995(11): 6-9. (in Chinese)
[9] 陈斐利, 朱杰. A new speech endpoint detection method based on autocorrelation similarity distance[J]. 上海交通大学学报, 1999, 33(9): 1097-1099. (in Chinese)
[10] 卢艳玲, 侯榆青, 王宾, et al. A multi-feature algorithm for endpoint detection and syllable segmentation of noisy speech signals[J]. 电声技术, 2005(7): 60-62. (in Chinese)
[11] NEY H. An optimization algorithm for determining the endpoints of isolated utterances[C]//Proc of ICASSP. 1981: 720-723.
[12] 刘庆升, 徐霄鹏, 黄文浩. An investigation of a speech endpoint detection method[J]. 计算机工程, 2003, 29(3): 120-121. (in Chinese)
[13] RABINER L R, SAMBUR M R. Voiced-unvoiced-silence detection using the Itakura LPC distance measure[C]//Proc of ICASSP. 1977: 323-326.
[14] SHEN J L, HUNG J W, LEE L S. Robust entropy-based endpoint detection for speech recognition in noisy environments[C]//Proc of International Conference on Spoken Language Processing. Sydney: [s.n.], 1998: 232-238.
[15] 于迎霞, 史家茂. An improved cepstrum-based endpoint detection method for noisy speech[J]. 计算机工程, 2004, 30(19): 85-87. (in Chinese)
[16] SHANNON C E. A mathematical theory of communication[J]. Bell System Technical Journal, 1948, 27: 379-423.
[17] WU Bing-fei, WANG Kun-ching. Robust endpoint detection algorithm based on the adaptive band-partitioning spectral entropy in adverse environments[J]. IEEE Trans on Speech and Audio Processing, 2005, 13(5): 762-775.
[18] WU G D, LIN C T. Word boundary detection with Mel-scale frequency bank in noisy environment[J]. IEEE Trans on Speech and Audio Processing, 2000, 8(5): 541-554.
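[19] 李晔, 张仁智, 崔慧娟, et al. A spectral-entropy-based speech endpoint detection algorithm for low signal-to-noise ratios[J]. 清华大学学报:自然科学版, 2005, 45(10): 1397-1400. (in Chinese)
[20] 吴军, 王作英. Information entropy of Chinese and the perplexity of language models[J]. 电子学报, 1996, 24(10): 69-71. (in Chinese)
[21] 田野, 王作英, 陆大金. An endpoint detection algorithm in noise based on linear mapping of subband energy[J]. 清华大学学报:自然科学版, 2002, 42(7): 953-956. (in Chinese)
[22] 王让定, 柴佩琪. An improved speech endpoint detection method based on spectral entropy[J]. 信息与控制, 2004, 33(1): 77-81. (in Chinese)
[23] 陈四根, 和应民. A speech endpoint detection method based on information entropy[J]. 应用科技, 2001, 28(3): 13-14. (in Chinese)
[24] HUANG Liang-sheng, YANG C H. A novel approach to robust speech endpoint detection in car environments[C]//Proc of ICASSP. 2000: 1751-1754.
[25] JUNQUA J C, MAK B, REAVES B. A robust algorithm for word boundary detection in the presence of noise[J]. IEEE Trans Speech Audio Processing, 1994, 2(3): 406-412.
[26] 郭继云, 王守觉, 刘学刚. An improved endpoint detection algorithm based on the frequency-energy ratio[J]. 计算机工程与应用, 2005, 41(29): 91-93. (in Chinese)
[27] 朱杰, 韦晓东. An HMM-based speech endpoint detection method for noisy environments[J]. 上海交通大学学报, 1998, 32(10): 14-16. (in Chinese)
[28] 徐筱麟, 张兴国. A voice activity detection method based on a Markov process statistical model[J]. 解放军理工大学学报:自然科学版, 2003, 4(1): 7-10. (in Chinese)
[29] 董恩清, 赵鹤鸣, 周亚同, et al. Application of support vector machines to voice activity detection[J]. 通信学报, 2003, 24(3): 70-77. (in Chinese)
[30] 范万春, 施仁, 孙煜, et al. An endpoint detection method for seismic signals using a statistical model[J]. 西安交通大学学报, 2001, 35(4): 365-369. (in Chinese)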
[31] SOHN J, KIM N S, SUNG W. A statistical model-based voice activity detection[J]. IEEE Signal Processing Letters, 1999, 6(1): 1-3.
[32] QI Ying-yong, HUNT B R. Voiced-unvoiced-silence classification of speech using hybrid features and a network classifier[J]. IEEE Trans on Speech and Audio Processing, 1993, 1(2): 250-255.
[33] KIA S J, COGHILL G G. A mapping neural network and its application to voiced-unvoiced-silence classification[C]//Proc of the 1st New Zealand Int Two-Stream Conf Artificial Neural Networks Expert Systems. 1993: 104-108.
[34] GHISELLI-CRIPPA T, EL-JAROUDI A. A fast neural net training algorithm and its application to voiced-unvoiced-silence classification of speech[C]//Proc of Int Conf on Speech Language Processing. 1991: 441-444.
[35] 魏涛, 顾涵铮. A voice activity detection algorithm based on acoustic classification[J]. 合肥工业大学学报:自然科学版, 2001, 24(2): 222-225. (in Chinese)
[36] 邝航宇, 张军, 韦岗. An isolated-word endpoint detection algorithm based on vowel detection[J]. 电声技术, 2005(3): 40-43. (in Chinese)
[37] 刘鹏, 王作英. Multi-mode speech endpoint detection[J]. 清华大学学报:自然科学版, 2005, 45(7): 896-899. (in Chinese)
[38] VATIKIOTIS-BATESON E, BAILLY G, PERRIER P. Audio visual speech processing[M]. [S.l.]: MIT Press, 2007.
[39] 丁琦, 徐望, 王炳锡. Word boundary detection in variable-energy noise environments based on a fuzzy classifier[J]. 电声技术, 2003(5): 45-49. (in Chinese)
[40] BERITELLI F. A robust endpoint detector based on differential parameters and fuzzy pattern recognition[C]//Proc of ICSP. 1998: 601-604.
[41] 赵力. Speech signal processing[M]. Beijing: China Machine Press, 2003. (in Chinese)
[42] 张蕾. Computers can read lips too[EB/OL]. http://www.people.com.cn/GB/it/53/142/20030501/983126.html. (in Chinese)
[43] Intel releases lip-reading AVSR software[EB/OL]. (2003-04-30). http://article.pchome.net/content-6819.htm. (in Chinese)
[44] A mobile phone that can read lips[EB/OL]. (2002-04-11). http://www.zaobao.com/special/newspapers/2002/04/hfwb110402.html. (in Chinese)
[45] [EB/OL]. (2004-04-09). http://computer.online.sh.cn/compu-ter/gb/content/2002-04/09/content_32581 2.htm.