Research on BitTable-Based Association Rule Mining and Associative Classification
Abstract
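As people's ability to produce and collect data with information technology has grown dramatically, the volume of data has expanded rapidly. Discovering hidden, previously unknown information and knowledge in massive data quickly and effectively has become especially important, and data mining is a powerful tool for this problem. Association rule mining is an important area of data mining research; in a sense, it is the essence of data mining. In recent years, related research and applications have consistently accounted for a significant share of the field and have developed rapidly. Studying how to mine the association rules contained in massive databases quickly and effectively, and how to put the mined rules to good use, is of great theoretical and practical significance. Building on an analysis of the problems in existing mining algorithms, this dissertation proposes BitTable-based algorithms for mining complete frequent itemsets and inter-transaction frequent closed itemsets, and further studies the application of association rules to classification, using them to solve the remote sensing image classification problem. The research can be summarized in the following three parts: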
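     1. A fast algorithm for mining intra-transaction complete frequent itemsets. Most existing complete frequent itemset mining algorithms are based on Apriori and are known as Apriori-like algorithms. When generating candidates, they must compare the first n-1 items of two itemsets one by one, and when computing support they must scan all or part of the database record by record. This consumes a large amount of computation time and I/O and is the main bottleneck of such algorithms. To address these problems, this dissertation first proposes the BitTable data structure and its corresponding binary operations. BitTable is used to compress the transaction database, and the support of candidate itemsets is computed quickly with bitwise AND and OR operations, which replaces the inefficient database scans. It is also used to compress candidate and frequent itemsets horizontally, so that candidates can be generated directly without item-by-item comparison. The data structure and its operations can be applied directly to existing Apriori-like algorithms and effectively improve their efficiency. On the basis of BitTable, the dissertation further proposes BitTableFI, a BitTable-based association rule mining algorithm. Simulation experiments on commonly used datasets demonstrate its effectiveness.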
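     2. Inter-transaction frequent closed itemsets and a fast algorithm for mining them. Compared with intra-transaction frequent itemsets, inter-transaction frequent itemsets can reveal associations among attributes at different times and are an extension of intra-transaction frequent itemsets. However, the number of inter-transaction frequent itemsets grows rapidly as the sliding time window widens, which lowers mining efficiency. Representing inter-transaction frequent itemsets by closed itemsets reduces the number of itemsets without losing information. By analyzing the intrinsic relationship between intra-transaction and inter-transaction frequent closed itemsets, this dissertation proposes an algorithm that generates inter-transaction frequent closed itemsets from intra-transaction frequent closed itemsets. The algorithm uses partitioning and conditional database techniques to avoid generating a huge extended transaction database, and uses an extended BitTable structure to compress transactions and speed up support counting. In addition, dynamic ordering and hashing greatly reduce the number of closedness tests on frequent closed itemsets. The algorithm offers an effective and fast way to mine inter-transaction frequent closed itemsets.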
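     3. A fuzzy associative classification algorithm and its application to remote sensing image classification. Associative classification applies mined frequent itemsets to classification problems, tightly coupling the mining of association rules with their use. Introducing fuzzy methods into associative classification mitigates the "sharp boundary" problem of rules. However, most existing fuzzy associative classification algorithms partition continuous attributes with fixed fuzzy membership functions and do not take the characteristics of the data into account. This dissertation therefore proposes FARC (Fuzzy Association Rules Classification), a fuzzy associative classification algorithm based on adaptive interval partitioning. It uses the fuzzy c-means clustering algorithm to build fuzzy intervals adaptively from the data, and applies a new pruning strategy when mining fuzzy association rules, which greatly reduces the number of candidates. A new rule-weight measure makes better use of multiple fuzzy association rules for classification. Experiments on test data from the UC Irvine Machine Learning Repository show that FARC not only achieves high classification accuracy but is also insensitive to the number of training samples, maintaining good accuracy as the training set shrinks. The dissertation also introduces fuzzy associative classification into remote sensing image classification. In practical remote sensing classification, training samples are often hard to obtain, and a shortage of them lowers classification accuracy. Because FARC copes well with small training sets, it is well suited to practical remote sensing classification.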
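The support-counting idea in item 1 above can be illustrated with a minimal sketch. It is not the dissertation's BitTableFI algorithm, whose details the abstract does not give. It assumes one simple layout: each item is mapped to a Python integer whose bit t is set when transaction t contains the item, so the support of an itemset is the number of 1 bits left after ANDing its items' integers. The names `build_bit_columns` and `support` are hypothetical.

```python
from functools import reduce


def build_bit_columns(transactions):
    """Map each item to an int whose bit t is 1 when transaction t contains the item."""
    columns = {}
    for t, items in enumerate(transactions):
        for item in items:
            columns[item] = columns.get(item, 0) | (1 << t)
    return columns


def support(itemset, columns, n_transactions):
    """Count the transactions that contain every item in `itemset` via bitwise AND."""
    all_ones = (1 << n_transactions) - 1  # the empty itemset matches every transaction
    mask = reduce(lambda acc, item: acc & columns.get(item, 0), itemset, all_ones)
    return bin(mask).count("1")


# Toy database (an assumption for illustration only).
transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"b", "d"},
    {"a", "b", "c", "d"},
]
columns = build_bit_columns(transactions)
support_ac = support({"a", "c"}, columns, len(transactions))
support_bd = support({"b", "d"}, columns, len(transactions))
```

Once the bit columns are built, each support query touches only a few integers, so no further pass over the transaction list is needed. That is the kind of saving the abstract attributes to replacing database scans with bitwise operations.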
As people's ability to produce and collect data with information technology increases, the scale of data grows rapidly. It is very important to discover the hidden and unknown knowledge in databases, and data mining is a powerful tool for this problem. Association rule mining is an important field of data mining; in a sense, it is the essence of data mining. Research on it and applications of it make up an important share of data mining research and have developed rapidly. Research on how to mine association rules from massive databases efficiently and use them reasonably is of great theoretical and practical significance. Based on an analysis of current mining algorithms, BitTable-based algorithms for mining complete frequent itemsets and inter-transaction frequent closed itemsets are proposed, and association rules are further applied to the problem of remote sensing image classification. The research work in this dissertation can be summarized in the following three aspects:
     1. A fast algorithm for mining complete frequent itemsets. Most current complete frequent itemset mining algorithms are based on the Apriori algorithm and are called Apriori-like algorithms. When generating candidate itemsets, they must check whether two itemsets share the same first n-1 items, and when counting support they must scan all or part of the database record by record, which costs a great deal of CPU time and I/O. These two problems are the main bottlenecks of Apriori-like algorithms. To address them, the dissertation proposes a data structure named BitTable together with its bitwise operations. BitTable is used to compress the database and to compute the support of candidate itemsets quickly with bitwise AND/OR operations, which avoids scanning the database. It also compresses the candidate and frequent itemsets horizontally and generates candidate itemsets directly, which avoids item-by-item comparison. The data structure can be applied directly in Apriori-like algorithms and improves their performance effectively. Moreover, an association rule mining algorithm named BitTableFI is proposed based on BitTable. The experimental results demonstrate the effectiveness of the BitTableFI algorithm.
     2. Inter-transaction frequent closed itemsets and a fast algorithm for mining them. Compared with intra-transaction frequent itemsets, inter-transaction frequent itemsets can reveal the relationships among attributes at different moments and are an extension of intra-transaction frequent itemsets. However, the number of inter-transaction frequent itemsets increases rapidly as the sliding time window grows, which reduces the efficiency of the mining algorithm. Using closed itemsets to represent inter-transaction frequent itemsets reduces the number of itemsets without loss of information. By analyzing the internal relation between inter-transaction and intra-transaction frequent closed itemsets, this dissertation proposes an algorithm for mining inter-transaction frequent closed itemsets. The algorithm adopts partitioning and conditional database techniques to avoid generating a huge extended database, and uses the extended BitTable to compress transactions and improve the efficiency of support counting. Dynamic ordering and a hash table reduce the number of tests on candidate closed inter-transaction itemsets. Simulations show that the algorithm mines inter-transaction frequent closed itemsets quickly and efficiently.
     3. Fuzzy associative classification and its application to remote sensing image classification. Associative classification uses association rules to solve classification problems. Introducing fuzzy concepts into associative classification mitigates the "sharp boundary" problem. However, most fuzzy associative classification algorithms use fixed membership functions to generate fuzzy sets, without considering the intrinsic characteristics of the data. To address this issue, the dissertation proposes FARC, a fuzzy associative classification algorithm based on adaptive interval partitioning. Following the intrinsic characteristics of the data, FARC employs fuzzy c-means to partition continuous attributes, adopts new joining and pruning techniques to avoid generating useless candidate itemsets, and introduces a weight parameter to score the fuzzy association rules. Experiments on UCI datasets show that the proposed method not only achieves high classification accuracy but is also insensitive to variation in the size of the training set. The dissertation also introduces fuzzy associative classification into remote sensing image classification. In real remote sensing applications, training data are hard to obtain, which greatly affects the accuracy of traditional classifiers. FARC copes well with the shortage of training data in real remote sensing classification and achieves high classification accuracy.
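The adaptive partitioning step in item 3 can be illustrated with a minimal sketch of standard fuzzy c-means on a single continuous attribute. It is not the FARC implementation: the abstract says only that FARC uses fuzzy c-means to build the fuzzy intervals, so the function names, the evenly spaced initial centers, the fuzzifier m = 2, and the toy values are all assumptions made for illustration.

```python
def memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of scalar x in each cluster; the values sum to 1."""
    dists = [abs(x - c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuzzy_c_means_1d(values, n_clusters, m=2.0, iterations=100, tol=1e-9):
    """Return sorted cluster centers for one continuous attribute."""
    lo, hi = min(values), max(values)
    # Deterministic start: centers spread evenly over the value range (sketch assumption).
    centers = [lo + (hi - lo) * (i + 0.5) / n_clusters for i in range(n_clusters)]
    for _ in range(iterations):
        u = [memberships(x, centers, m) for x in values]
        new_centers = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in u]
            new_centers.append(
                sum(w * x for w, x in zip(weights, values)) / sum(weights)
            )
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return sorted(centers)


# Toy attribute values with two obvious groups (illustrative data only).
values = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
centers = fuzzy_c_means_1d(values, n_clusters=2)
degrees = memberships(2.0, [1.0, 5.0])
```

Each center then defines one fuzzy interval, and `memberships` gives the degree to which a value belongs to each interval. Because the degrees vary smoothly between centers, a value near a boundary contributes to both neighboring intervals rather than being cut off by a sharp threshold.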
    [161]F.Thabtah,P.Cowling,and S.Hammoud.Improving rule sorting,predictive accuracy and training time in associative classification[J].Expert Systems with Applications,2006,31(2):414-426.
    [162]K.Wang,S.Zhou,and Y.He.Growing decision trees on support-less association rules[C].Proceedings of the KDD,Boston,MA,2000:265-269.
    [163]E.Baralis and P.Garza.A lazy approach to pruning classification roles[C].Proceedings of the IEEE 2002 International Conference on Data Mining,Maebashi City,Japan,2002:35-42.
    [164]邹晓峰,陆建江,宋自林.语言值关联规则挖掘算法[J].系统仿真学报,2002,14(9):1130-1132.
    [165]R.Nishii and S.Eguchi.Supervised image classification by contextual AdaBoost based on posteriors in neighborhoods[J].IEEE Transactions on Geoscience and Remote Sensing,2005,43(11):2547-2554.
    [166]L.Bruzzone and D.F.Prieto.Unsupervised retraining of a maximum likelihood classifier for the analysis of multitemporal remote sensing images[J].IEEE Transactions on Geoscience and Remote Sensing,2001,39(2):456-460.
    [167]R.Tonjes,S.Growe,J.Buckner,and C.E.Liedtke.Knowledge-based interpretation of remote sensing images using semantic nets[J].Photogrammetric Engineering and Remote Sensing,1999,65(7):811-821.
    [168]刘伟,崔宝侠.广义LVQ算法及其在遥感影像分类中的应用研究[J].电子与信息学报,2007,7:1201-1203.
    [169]M.K.Pakhira,S.Bandyopadhyay,and U.Maulik.Validity index for crisp and fuzzy clusters[J].Pattern Recognition,2004,37(3):487-501.
    [170]G.B.Huang,Q.Y.Zhu,and C.K.Siew.Extreme learning machine:Theory and applications[J].Neurocomputing,2006,70(1-3):489-501.
    [171]C.W.T.Yeu,M.H.Lim,G.B.Huang,A.Agarwal,and Y.S.Ong.A new machine learning paradigm for terrain reconstruction[J].IEEE Geoscience and Remote Sensing Letters,2006,3(3):382-386.
