Research on Gene Prediction Based on Signal Processing Theory and Methods
Abstract
Genes are the basic units of heredity: they are the DNA segments that carry genetic information, whereas non-gene regions do not code for proteins. Locating the gene regions in a DNA sequence has therefore long been an important problem in bioinformatics. This dissertation applies the theory and methods of signal processing, including transform-domain methods, digital filters, time-frequency analysis, statistical learning, and intelligent algorithms, to gene prediction.
     First, the principle of gene-prediction filters is analyzed, and two factors are identified as important influences on the prediction result: the length of the coding region and the strength of its periodicity. Based on the periodicity of coding regions, an FIR digital filter and adaptive filters with narrow pass-bands are designed. Applied to annotated gene data, the filters yield time-domain prediction curves of exon locations; the results indicate that the designed filters are effective and can improve the accuracy of gene prediction.
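As a rough illustration of the narrow-band filtering idea, the sketch below builds a band-pass FIR filter centred on the period-3 frequency (2*pi/3 rad/sample) and applies it to the four binary indicator sequences of a DNA string. The window type (Hamming), the tap count (121), and all function names are assumptions chosen for illustration; the abstract does not give the dissertation's actual filter design.

```python
import numpy as np

def indicator_sequences(dna):
    """Map a DNA string to four binary indicator sequences, one per base."""
    codes = np.frombuffer(dna.upper().encode("ascii"), dtype=np.uint8)
    return {base: (codes == ord(base)).astype(float) for base in "ACGT"}

def period3_fir(num_taps=121):
    """Band-pass FIR taps: a Hamming window modulated to 2*pi/3 (assumed design)."""
    n = np.arange(num_taps) - (num_taps - 1) / 2      # symmetric taps, linear phase
    h = np.hamming(num_taps) * np.cos(2 * np.pi * n / 3)
    gain = abs(np.sum(h * np.exp(-2j * np.pi * n / 3)))  # response at 2*pi/3
    return h / gain                                    # unit gain at the pass-band centre

def period3_curve(dna, num_taps=121):
    """Sum of squared filter outputs over the four bases, one value per position."""
    if len(dna) < num_taps:
        raise ValueError("sequence must be at least as long as the filter")
    h = period3_fir(num_taps)
    curve = np.zeros(len(dna))
    for x in indicator_sequences(dna).values():
        curve += np.convolve(x, h, mode="same") ** 2
    return curve
```

Peaks in the returned curve mark stretches with a strong period-3 component, which is the property such filters exploit to flag candidate coding regions. A longer filter gives a narrower pass-band but blurs the boundaries of short exons, which is consistent with the abstract's remark that coding-region length affects the result.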
     Second, the gene-prediction filter is combined with the Fourier transform to give an improved Fourier method for gene identification. The method amplifies the period-3 signal, filters out background noise, and, unlike existing Fourier methods, is not restricted by sequence length. Experiments show that the improved Fourier method raises prediction accuracy. In addition, a windowed Fourier transform method is presented that can identify the coding and non-coding regions of a DNA sequence.
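For orientation, the following sketch computes the conventional sliding-window Fourier measure: the power of the four indicator sequences at frequency 2*pi/3 inside each window. It is the baseline windowed approach, not the dissertation's improved method, whose details the abstract does not give. The window length of 351 and the function name are assumptions.

```python
import numpy as np

def period3_spectrum(dna, window=351):
    """Power at 2*pi/3 in each sliding window, summed over the four bases.

    Entry i of the result describes the window dna[i:i + window].
    """
    if window % 3 != 0:
        raise ValueError("window length should be a multiple of 3")
    if len(dna) < window:
        raise ValueError("sequence is shorter than the window")
    codes = np.frombuffer(dna.upper().encode("ascii"), dtype=np.uint8)
    phase = np.exp(-2j * np.pi * np.arange(len(codes)) / 3)
    power = np.zeros(len(codes) - window + 1)
    for base in b"ACGT":                       # iterating bytes yields integer codes
        x = (codes == base).astype(float)
        # Running sum lets every window's DFT bin be read off in O(1).
        csum = np.concatenate(([0], np.cumsum(x * phase)))
        dft_bin = csum[window:] - csum[:-window]
        power += np.abs(dft_bin) ** 2
    return power
```

Choosing a window length divisible by 3 places 2*pi/3 exactly on a DFT bin, so the constant (DC) part of each indicator sequence contributes nothing to the measure. The fixed window is also the source of the length restriction mentioned above: an exon much shorter than the window is averaged together with its non-coding surroundings.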
     Third, a hidden Markov model of gene coding regions is combined with the forward algorithm to identify exons. In tests on annotated DNA sequences, the algorithm is effective and also reduces the computational cost. In addition, a support vector machine is applied to gene classification; experiments show that this method not only improves prediction accuracy but also reduces the amount of training data required.
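The forward algorithm itself is standard and can be sketched briefly. The two-state model below (state 0 = non-coding, state 1 = coding) and all of its probabilities are hypothetical values chosen only to make the example run; they are not the model or parameters used in the dissertation.

```python
import numpy as np

BASES = {"A": 0, "C": 1, "G": 2, "T": 3}

# Hypothetical toy parameters (assumptions, not taken from the dissertation).
start = np.array([0.5, 0.5])                      # initial state probabilities
trans = np.array([[0.95, 0.05],                   # non-coding -> {non-coding, coding}
                  [0.10, 0.90]])                  # coding     -> {non-coding, coding}
emit = np.array([[0.30, 0.20, 0.20, 0.30],        # non-coding emits A, C, G, T
                 [0.20, 0.30, 0.30, 0.20]])       # coding emits A, C, G, T

def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | model)."""
    alpha = start * emit[:, obs[0]]
    log_lik = 0.0
    for symbol in obs[1:]:
        scale = alpha.sum()                       # rescale to avoid underflow
        log_lik += np.log(scale)
        alpha = (alpha / scale) @ trans * emit[:, symbol]
    return log_lik + np.log(alpha.sum())

obs = [BASES[b] for b in "ATGGCGCA"]
print(forward_log_likelihood(obs, start, trans, emit))
```

The recursion costs on the order of T * N^2 operations for T symbols and N states, instead of summing over all N^T state paths. One way to use it for exon identification is to compare the likelihood of a segment under a coding-region model with its likelihood under a background model.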
     Finally, the effects of four features and three discriminant methods on prediction accuracy are studied, and on this basis a gene identification algorithm based on multiple features is implemented. The experimental results indicate that the algorithm further compensates for the shortcomings of the Fourier methods and, for short gene sequences, achieves higher prediction accuracy than existing gene identification algorithms.
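The abstract names neither the four features nor the three discriminant methods. Purely as an illustration of how several features can be combined into one coding/non-coding decision, the sketch below fits a Fisher linear discriminant to feature vectors. The choice of Fisher's method, the function names, and the assumption that each sequence is summarised by a numeric feature vector with at least two entries are all assumptions.

```python
import numpy as np

def fit_fisher(x_coding, x_noncoding):
    """Fisher linear discriminant on (n_samples, n_features) arrays.

    Returns the projection vector w and a decision threshold placed
    midway between the projected class means.
    """
    m1 = x_coding.mean(axis=0)
    m0 = x_noncoding.mean(axis=0)
    # Pooled within-class scatter matrix (requires at least two features).
    sw = (np.cov(x_coding, rowvar=False) * (len(x_coding) - 1)
          + np.cov(x_noncoding, rowvar=False) * (len(x_noncoding) - 1))
    w = np.linalg.solve(sw, m1 - m0)
    threshold = w @ (m1 + m0) / 2
    return w, threshold

def predict_coding(x, w, threshold):
    """True where a feature vector falls on the coding side of the threshold."""
    return x @ w > threshold
```

In use, each row would hold the feature values measured on one sequence segment, for example a period-3 measure alongside other sequence statistics. The discriminant then weights the features by how well each separates the two classes relative to its within-class spread.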
