Research on MRMR-Based Bayesian Network Structure Learning Algorithms
Abstract
Bayesian networks are probabilistic graphical models that compactly represent complex dependence and independence relationships among random variables. Because they support efficient inference even when data are incomplete, they are increasingly applied to decision support, diagnosis, and the control of complex systems. Learning a Bayesian network from raw data is the prerequisite for applying it. A Bayesian network consists of a structure and parameters, and structure learning is the core problem. This thesis studies the theory of Bayesian networks and the algorithms for learning them and, combining the max-relevance and min-redundancy (MRMR) feature selection technique, focuses on structure learning algorithms for both complete and incomplete data sets. The main contributions are as follows:
     1) For complete data sets, we improve the Ordering-based Max-Relevance and Min-Redundancy Greedy (OMRMRG) structure learning algorithm. OMRMRG introduces MRMR feature selection and a local Bayesian increment scoring function, which improve its precision and accuracy on limited data sets; however, because it generates the initial node ordering at random, its results carry considerable uncertainty. This thesis proposes a method for generating an optimized initial node ordering. Starting from this roughly ordered sequence, an iterative search with a neighbor-swap operator then recovers a more accurate Bayesian network structure in less time. Experimental results demonstrate the effectiveness of the method.
     2) For incomplete data sets, the BN-GS algorithm suffers from the uncertainty introduced by randomly initializing the missing data and randomly generating the node ordering. We instead use the node pre-sorting algorithm to produce the initial ordering, and we initialize each missing value of a variable with the value that occurs most frequently for that variable in the whole sample, which speeds up convergence and improves the accuracy of edge orientation. On the initialized complete data, Kruskal's algorithm builds a maximum-weight spanning tree, whose edges are then directed according to the node ordering. During the Gibbs-sampling iterations that refine the data, the local parameters are given a more accurate correction step. The improved BN-GS algorithm converges to the stationary distribution in less time and yields a better structure. Experimental results demonstrate its effectiveness and correctness.
     3) We study score-and-search methods and constraint-based methods and compare them experimentally. Compared with score-based methods, constraint-based methods are more intuitive and generally faster, and the arrows they produce can sometimes be interpreted as causal relationships; however, their independence tests are less accurate than scoring functions, and a single erroneous test strongly affects all subsequent computation.
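The MRMR criterion used throughout the thesis — pick the variable with maximal mutual information with the target while penalizing its mean mutual information with already-selected variables — can be sketched for discrete data as follows. This is only an illustrative greedy selector, not the thesis's implementation; the function names are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete samples."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        pj = c / n
        mi += pj * math.log(pj / ((px[x] / n) * (py[y] / n)))
    return mi

def mrmr_select(data, target, k):
    """Greedy MRMR: at each step pick the candidate maximizing
    relevance I(X; target) minus mean redundancy with selected features."""
    candidates = [v for v in data if v != target]
    selected = []
    while candidates and len(selected) < k:
        def score(v):
            rel = mutual_information(data[v], data[target])
            red = (sum(mutual_information(data[v], data[s]) for s in selected)
                   / len(selected)) if selected else 0.0
            return rel - red
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

A perfect copy of the target is selected first (maximal relevance); a second copy of an already-selected variable scores near zero because its redundancy cancels its relevance.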
Bayesian networks (BN) are probabilistic graphical models that can efficiently represent complex independence and dependence relationships among random variables. They retain the ability to reason efficiently even when data are incomplete, and are now widely applied to decision aid, diagnosis, the control of complex systems, etc. Learning a BN from raw data is a precondition for solving such problems. A BN is a directed acyclic graph with a probability table for each node, and structure learning is its core. This thesis discusses the theoretical background of BNs and the related learning algorithms. Based on the max-relevance and min-redundancy feature selection technique, it focuses on BN structure learning algorithms for complete data sets and missing data sets. Our research is summarized as follows:
     1) For complete data sets, we improved the Ordering-based Max-Relevance and Min-Redundancy Greedy algorithm, which introduces the max-relevance and min-redundancy feature selection technique and uses a local Bayesian increment score function to improve precision and accuracy on limited data sets. Because the initial node order is generated randomly, the uncertainty of the results increases greatly. This thesis proposes a new method that finds a roughly correct initial order; starting from it, a neighbor-swap operator iteratively searches for a better result. Experimental results show that the new approach is effective.
     2) For missing data sets, the BN-GS (Bayesian Network & Gibbs Sampling) algorithm brings large uncertainty because it initializes the data sets and produces the node order randomly. In this thesis, the pre-sorting algorithm proposed above is used to produce the initial node order, and each missing value of a node is initialized with the value that occurs most frequently in the entire data set. This speeds up convergence and increases the accuracy of edge orientation. Then, using Kruskal's algorithm, a maximum-weight spanning tree is constructed from the completed data set, and the edges are oriented according to the initial node order. In the Gibbs-sampling process that modifies the local parameters, a more accurate modification value is assigned. The improved BN-GS algorithm can converge to the stationary distribution in a shorter time. Experimental results show that the new approach is correct and effective.
     3) This thesis compares and analyzes the score-and-search method and the constraint-based method experimentally. Compared to score-and-search, the constraint-based method is more direct and faster, and the arrows obtained from the data can sometimes be interpreted as causal relationships. Its drawback is that the independence tests are less accurate than scoring functions, and once an error occurs, later calculations are greatly affected.
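The spanning-tree step in point 2) can be illustrated in isolation. The sketch below assumes mutual-information edge weights have already been computed; `max_weight_spanning_tree` and `orient_by_order` are hypothetical names, and this is only a minimal illustration of Kruskal's algorithm with order-based edge orientation, not the BN-GS implementation.

```python
def max_weight_spanning_tree(nodes, weights):
    """Kruskal's algorithm run on edges sorted by descending weight:
    greedily keep the heaviest edge that does not close a cycle,
    using a union-find structure with path compression."""
    parent = {v: v for v in nodes}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path compression
            v = parent[v]
        return v
    tree = []
    for (u, v), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:          # edge joins two components: no cycle
            parent[ru] = rv
            tree.append((u, v))
    return tree

def orient_by_order(edges, order):
    """Direct each undirected edge from the node that appears earlier
    in the given node ordering to the one that appears later."""
    rank = {v: i for i, v in enumerate(order)}
    return [(u, v) if rank[u] < rank[v] else (v, u) for u, v in edges]
```

On a four-node example the heaviest acyclic edges are kept and then oriented consistently with the ordering, which is how the BN-GS variant obtains its initial directed skeleton.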
