Research on Speech Recognition Methods Based on Deep Learning

English title: Research on speech recognition based on depth learning
Authors: SHAO Na (邵娜); LI Xiaokun (李晓坤); LIU Lei (刘磊); CHEN Hongxu (陈虹旭); ZHENG Yongliang (郑永亮); YANG Lei (杨磊)
Keywords: speech recognition; deep learning; deep neural network
Journal code: DLXZ
Journal: Intelligent Computer and Applications (智能计算机与应用)
Affiliations: Heilongjiang Hengxun Technology Co., Ltd. Postdoctoral Programme; Engineering Research Center of Smart Media, Heilongjiang Province (Heilongjiang Hengxun Technology Co., Ltd.)
Publication date: 2019-02-18
Publisher: Intelligent Computer and Applications (智能计算机与应用)
Year: 2019
Issue: v.9
Funding: Innovation Fund for Small and Medium-sized Enterprises (2017FF1GJ023); Patent Advantage Demonstration Enterprise Fund (2017YBQCZ029); National Natural Science Foundation of China (81273649)
Language: Chinese
Page: DLXZ201902031
Page count: 8
CN: 02
ISSN: 23-1573/TN
Classification number: 143-150

Abstract

This paper reviews the history and current state of deep learning and speech recognition technology, and explains the significance and purpose of the study. It builds an acoustic model with deep learning, applies that model to speech recognition, and analyzes the development prospects of the approach. By studying deep-learning-based speech recognition methods, the paper aims to improve the efficiency and accuracy of speech recognition and thereby promote the development of speech recognition technology.

