Abstract
The naive Bayes classifier has been the subject of improvement efforts for decades, and methods for analyzing dependencies among attributes have become increasingly diverse. To address the loss of classification accuracy that results when the naive Bayes classifier ignores dependencies among attributes, this paper proposes an improved grouping algorithm for the semi-naive Bayes classifier based on a greedy selection algorithm. Adjusting different parameters and applying attribute selection techniques yields three grouping methods, each giving a different form of improvement. A greedy-selection semi-naive Bayes classifier is built on these methods so that the dependencies among attributes are fully exploited. Experiments classify datasets selected from the UCI repository together with economic data, and the results show that the improved classifier achieves good classification accuracy.
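The abstract does not spell out the three grouping methods, so the following is only a minimal sketch of the general idea of greedy attribute grouping in a semi-naive Bayes classifier, not the paper's algorithm. It assumes discrete attributes, uses empirical conditional mutual information given the class as the dependency measure, and merges the most dependent attributes first. The function names and the `threshold`, `max_size`, and `alpha` parameters are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log


def cond_mutual_info(rows, labels, i, j):
    """Empirical I(X_i; X_j | C) in nats, from raw counts."""
    n = len(rows)
    n_xyc = Counter((r[i], r[j], c) for r, c in zip(rows, labels))
    n_xc = Counter((r[i], c) for r, c in zip(rows, labels))
    n_yc = Counter((r[j], c) for r, c in zip(rows, labels))
    n_c = Counter(labels)
    return sum(
        (k / n) * log(k * n_c[c] / (n_xc[(x, c)] * n_yc[(y, c)]))
        for (x, y, c), k in n_xyc.items()
    )


def greedy_groups(rows, labels, threshold=0.05, max_size=2):
    """Greedily merge attributes, most dependent pair first (assumed rule)."""
    n_attrs = len(rows[0])
    group_of = {a: [a] for a in range(n_attrs)}
    scores = {p: cond_mutual_info(rows, labels, *p)
              for p in combinations(range(n_attrs), 2)}
    for (i, j), score in sorted(scores.items(), key=lambda kv: -kv[1]):
        if score < threshold:
            break  # remaining pairs are treated as independent
        gi, gj = group_of[i], group_of[j]
        if gi is not gj and len(gi) + len(gj) <= max_size:
            merged = sorted(gi + gj)
            for a in merged:
                group_of[a] = merged
    groups = []
    for a in range(n_attrs):
        g = tuple(group_of[a])
        if g not in groups:
            groups.append(g)
    return groups


def train(rows, labels, groups, alpha=1.0):
    """Count joint values of each attribute group per class."""
    class_counts = Counter(labels)
    joint = Counter()
    values = [set() for _ in groups]
    for r, c in zip(rows, labels):
        for g_idx, g in enumerate(groups):
            v = tuple(r[a] for a in g)
            joint[(g_idx, v, c)] += 1
            values[g_idx].add(v)
    return class_counts, joint, [len(v) for v in values], alpha


def predict(model, groups, row):
    """Return the class with the highest Laplace-smoothed log posterior."""
    class_counts, joint, n_values, alpha = model
    n = sum(class_counts.values())

    def log_post(c):
        lp = log(class_counts[c] / n)
        for g_idx, g in enumerate(groups):
            v = tuple(row[a] for a in g)
            lp += log((joint[(g_idx, v, c)] + alpha)
                      / (class_counts[c] + alpha * n_values[g_idx]))
        return lp

    return max(class_counts, key=log_post)


# Toy data: the class is x0 XOR x1, and x2 is irrelevant noise.
rows = [(a, b, z) for a in (0, 1) for b in (0, 1) for z in (0, 1)]
labels = [a ^ b for a, b, _ in rows]
groups = greedy_groups(rows, labels)
model = train(rows, labels, groups)
print(groups, predict(model, groups, (0, 1, 0)))
```

On this toy data the first two attributes are grouped because they are fully dependent given the class, which is exactly the case where a plain naive Bayes classifier cannot separate the classes. A wrapper-style variant that accepts a merge only when validation accuracy improves would be another plausible reading of "greedy selection".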