Research on Statistical Machine Translation Based on Neural Network Learning


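English title: Neural Network Learning for Statistical Machine Translation
Author: Yang Nan (杨南)
Thesis level: Doctoral
Discipline: Signal and Information Processing
Chinese keywords: 统计机器翻译 ; 人工神经网络 ; 深度学习
English keywords: Statistical Machine Translation ; Artificial Neural Network ; Deep Learning
Degree year: 2014
Advisor: Yu Nenghai (俞能海)
Discipline code: 081002
Degree-granting institution: University of Science and Technology of China
Thesis submission date: 2014-05-01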

Abstract

Research on statistical machine translation (SMT) has grown rapidly in recent years, and translation quality has improved substantially. However, the limited amount of bilingual training data and the lack of effective feature representations have impeded further progress in key components such as word alignment, reordering, and translation modeling, and translation quality remains unsatisfactory. Meanwhile, deep learning, an emerging machine learning method, can automatically learn abstract feature representations and model complex mappings between input and output signals, which opens new avenues for SMT research.
     This thesis explores how to use deep neural networks to learn representations that better describe translation phenomena for the key problems in SMT, and thereby improve SMT performance. Specifically, the main work and contributions are as follows:
     ●We propose a word alignment method based on a deep neural network. The model combines a multilayer neural network with an undirected probabilistic graphical model, and models word alignment more accurately by exploiting lexical similarity and context information. We investigate both semi-supervised and unsupervised training on monolingual data and bilingual parallel corpora. Large-scale Chinese-to-English word alignment experiments show that the proposed model significantly improves alignment quality over the baseline system.
     ●We propose a neural network based pre-reordering model for SMT. Using a neural network based dimensionality reduction technique, we learn low-dimensional embeddings for arbitrary reordering features from unlabeled data; a multilayer network then combines these feature embeddings with other features, such as word embeddings, and the result is integrated into a linear-ordering reordering model. Chinese-to-English and Japanese-to-English translation experiments show that the proposed model significantly improves translation performance over strong baseline systems.
     ●We propose a new network structure, the recursive recurrent neural network, for modeling the translation decoding process. It combines the strengths of recursive and recurrent neural networks: it can use arbitrary global features to characterize translation correspondences, and it dynamically generates abstract representations of the translation derivation tree during decoding. We apply this model to SMT decoding and propose a three-step semi-supervised training method for it. We also investigate representations for translation phrase pairs and propose a representation based on translation confidence. Chinese-to-English translation experiments show a clear improvement in translation performance.
     In short, this thesis investigates neural network learning for three main tasks in SMT. For each task, we design a dedicated neural network structure and learn task-specific abstract feature representations. In future work, we hope to consolidate these representations and, using neural networks and SMT techniques, explore a universal language representation that can help other natural language processing tasks.
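     The linear-ordering view of reordering mentioned in the abstract can be illustrated with a small sketch. This is not the thesis implementation: the abstract does not give the model's equations, so the pairwise score table below is hypothetical (in the thesis such scores would come from the neural network over feature embeddings), and the function names and the exhaustive search are assumptions chosen only to keep the example short. The sketch scores a permutation of source positions by summing a pairwise preference for every pair (i, j) in which i is placed before j, then picks the highest-scoring permutation.

```python
from itertools import permutations


def ordering_score(order, pair_score):
    """Sum pair_score[(i, j)] over every pair where i is placed before j."""
    total = 0.0
    for a in range(len(order)):
        for b in range(a + 1, len(order)):
            total += pair_score[(order[a], order[b])]
    return total


def best_ordering(n, pair_score):
    """Exhaustively search all permutations of n positions (toy sizes only)."""
    return max(permutations(range(n)),
               key=lambda order: ordering_score(order, pair_score))


# Hypothetical pairwise scores for a 3-word source sentence:
# pair_score[(i, j)] is the preference for placing word i before word j.
pair_score = {
    (0, 1): 0.2, (1, 0): 0.9,
    (0, 2): 0.8, (2, 0): 0.1,
    (1, 2): 0.7, (2, 1): 0.3,
}

best = best_ordering(3, pair_score)
print(best)  # (1, 0, 2)
```

     Exhaustive search is only feasible for very short inputs; a practical system would need an approximate or dynamic-programming search, which this sketch does not attempt.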

