Exploring Information-Theoretic Methods for the Analysis and Reconstruction of Gene Regulatory Networks
Abstract
In recent years, the completion of the Human Genome Project and the emergence and application of DNA microarray technology have made it possible to measure, quantitatively and simultaneously, the expression levels of thousands of genes in a biological sample. This lays the foundation for studying large-scale, complex gene expression regulation with mathematical and computational methods. Some researchers have already begun mapping the regulatory networks that control gene expression across an entire living cell. These gene regulatory networks are the manifestation of life functions at the level of gene expression. Reconstructing a gene regulatory network means using large amounts of gene expression data, together with suitable analytical and computational methods, to infer genetic interactions, simulate the system's dynamic behavior, and gain insight into the dependencies among genes. Conversely, an established gene regulatory network can guide further biological experiments. Research on gene regulatory networks builds on molecular biology, nonlinear mathematics, and information science, and has become an important topic of the post-genomic era.
     Research on gene regulatory networks reveals complex life phenomena from the perspective of interactions among genes. It is an important part of functional genomics and lies at the current frontier of bioinformatics. The application of gene chip technology in bioinformatics supplies a large body of basic data for studying and analyzing gene regulatory networks.
     This thesis explores research on gene regulatory networks. It first reviews the main mathematical methods and models used in bioinformatics to describe gene regulatory networks, such as directed and undirected graph models, Boolean network models, linear combination models, weighted matrix models, Bayesian network models, differential equation models, and mutual information relevance network models. It then studies the analysis and reconstruction of mutual information relevance network models based on information theory, and implements a mutual information relevance network reconstruction system built on the cross-platform wxWidgets library and the Boost library.
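The idea behind a mutual information relevance network can be illustrated with a minimal sketch. It is not the thesis's system, which the abstract describes as built on wxWidgets and Boost. The sketch assumes equal-width binning of each expression profile, the estimate I(X;Y) = H(X) + H(Y) - H(X,Y), and a fixed threshold for keeping an edge. The function names, bin count, threshold, and toy data are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def discretize(values, bins=3):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant profile falls into a single bin.
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def entropy(symbols):
    """Shannon entropy, in bits, of a sequence of discrete symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def mutual_information(x, y, bins=3):
    """Estimate I(X;Y) = H(X) + H(Y) - H(X,Y) from binned profiles."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    return entropy(dx) + entropy(dy) - entropy(list(zip(dx, dy)))


def relevance_network(profiles, threshold, bins=3):
    """Return edges (gene_a, gene_b, mi) whose MI is at least the threshold."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        mi = mutual_information(profiles[a], profiles[b], bins)
        if mi >= threshold:
            edges.append((a, b, mi))
    return edges


# Toy expression profiles (made-up numbers, one value per sample).
profiles = {
    "geneA": [0.1, 0.2, 0.9, 1.0, 0.5, 0.6],
    "geneB": [1.0, 0.9, 0.2, 0.1, 0.5, 0.6],  # mirrors geneA
    "geneC": [0.3, 0.3, 0.3, 0.3, 0.3, 0.3],  # constant profile
}
edges = relevance_network(profiles, threshold=0.5)
for gene_a, gene_b, mi in edges:
    print(f"{gene_a} -- {gene_b}: {mi:.3f} bits")
```

In this toy data only the geneA and geneB pair clears the threshold, because the constant geneC profile has zero entropy and therefore shares no information with any other gene. With real microarray data, the bin count and threshold strongly affect the result and would need to be chosen with care, for example by comparison against permuted profiles.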
