Research on Logistics Path Frequent Patterns Based on Parallel Apriori
  • English title: Research on Logistics Path Frequent Patterns Based on Parallel Apriori
  • Authors: CAO Jingjing (曹菁菁); REN Xinxin (任欣欣); XU Xianhao (徐贤浩)
  • Affiliations: College of Logistics Engineering, Wuhan University of Technology; School of Management, Huazhong University of Science and Technology
  • Keywords: big data; frequent path; Hadoop; Fuzzy c-means clustering algorithm; Apriori algorithm
  • Journal code: JSGG
  • Journal: Computer Engineering and Applications (计算机工程与应用)
  • Publication date: 2018-08-30 09:17
  • Publisher: Computer Engineering and Applications
  • Year: 2019
  • Issue: v.55; No.930
  • Funding: National Natural Science Foundation of China, Key International (Regional) Cooperation and Exchange Program (No. 71620107002); National Natural Science Foundation of China, Young Scientists Fund (No. 61502360)
  • Language: Chinese
  • Page: JSGG201911040
  • Number of pages: 8
  • CN: 11
  • Classification number: 262-269
Abstract
Traditional frequent path mining relies mainly on association rule algorithms, which consume too much memory and process data slowly when applied to large data sets. To address this, the paper proposes a parallel Apriori algorithm model based on the Fuzzy c-means clustering algorithm. The model first clusters the original data set with Fuzzy c-means, assigning logistics path data from the same region to a data cluster with high internal similarity. It then applies the Apriori algorithm to mine the frequent patterns within each cluster, which yields the frequent logistics paths of each region. The algorithm is parallelized on the Hadoop platform, which effectively improves its running efficiency and quality. Mining frequent logistics paths gives managers a clearer view of the flow of goods and can support decisions such as delivery route optimization.
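The two-stage idea in the abstract (cluster first, then mine each cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout (an origin coordinate plus a set of route legs), the function names, the deterministic initialization from the first c points, and the hardening of fuzzy memberships by taking the largest one are all assumptions made for illustration. The sketch runs serially; it only shows that each cluster is mined independently, which is the property a Hadoop deployment would exploit.

```python
import math
from collections import defaultdict
from itertools import combinations


def memberships(points, centers, m):
    """Fuzzy memberships: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))."""
    rows = []
    for p in points:
        # Clamp distances so a point sitting on a center does not divide by zero.
        d = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
        rows.append([1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in d) for dj in d])
    return rows


def fuzzy_c_means(points, c, m=2.0, iters=50):
    """Plain fuzzy c-means; returns (centers, membership rows)."""
    # Assumption: deterministic start from the first c points.
    centers = [tuple(points[i]) for i in range(c)]
    dims = range(len(points[0]))
    for _ in range(iters):
        u = memberships(points, centers, m)
        centers = []
        for j in range(c):
            w = [row[j] ** m for row in u]  # weights u_ij^m
            total = sum(w)
            centers.append(tuple(sum(wi * p[k] for wi, p in zip(w, points)) / total
                                 for k in dims))
    return centers, memberships(points, centers, m)


def apriori(transactions, min_support):
    """Return {frozenset: count} for itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    counts = defaultdict(int)
    for t in transactions:
        for item in t:
            counts[frozenset([item])] += 1
    current = {s: n for s, n in counts.items() if n >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        # Join: unions of frequent (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {cand for cand in candidates
                      if all(frozenset(s) in current for s in combinations(cand, k - 1))}
        current = {}
        for cand in candidates:
            n = sum(1 for t in transactions if cand <= t)
            if n >= min_support:
                current[cand] = n
        frequent.update(current)
        k += 1
    return frequent


def frequent_paths_by_region(records, c, min_support):
    """Cluster records by coordinate, then mine each cluster separately."""
    points = [coord for coord, _ in records]
    _, u = fuzzy_c_means(points, c)
    clusters = defaultdict(list)
    for (_, legs), row in zip(records, u):
        clusters[row.index(max(row))].append(legs)  # harden fuzzy membership
    # The per-cluster calls are independent, so they could run in parallel.
    return {j: apriori(ts, min_support) for j, ts in clusters.items()}


# Hypothetical toy data: (origin coordinate, set of route legs).
records = [
    ((0.0, 0.0), {"A-B", "B-C"}),
    ((10.0, 10.0), {"X-Y", "Y-Z"}),
    ((0.1, 0.2), {"A-B", "B-C", "C-D"}),
    ((10.2, 9.9), {"X-Y", "Y-Z"}),
    ((0.2, 0.1), {"A-B", "C-D"}),
    ((9.8, 10.1), {"X-Y", "W-X"}),
]
result = frequent_paths_by_region(records, c=2, min_support=2)
```

Treating a route as an unordered set of legs is a simplification that fits Apriori's itemset model; it discards the order in which legs are traversed, so a sequence-aware representation may be preferable if leg order matters.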
