Research on Mining Algorithms for Batch Processing Patterns of Workflow Instances
Abstract
A workflow is a business process that can be executed fully or partly automatically, and an activity is one logical step of a workflow. Batch processing of workflow activities merges multiple activity instances (cases) of the same type, so that instances that would otherwise be executed one by one are executed together as a group. Batch processing can reduce the cost of executing activities and improve execution efficiency, yet existing workflow systems almost entirely ignore it and offer very little support for it.
     Workflow management systems that support dynamic batch processing (DBP), and the corresponding DBP patterns, have attracted attention and exploratory research, but much work remains. Such a system needs the following settings in order to run: 1) which activities can be batch processed (hereafter, batch-deserving activities); 2) which activities belong to the same batch processing area; and 3) how the activities in a batch processing area are batch processed. In practice, as with defining a workflow model, leaving these settings entirely to the business process designer requires extensive business experience, is easily influenced by the designer's subjective judgment, and is time-consuming and error-prone. Moreover, designers may not know at workflow build time which activities deserve batch processing, because no real execution data is available yet, and they often overlook batch processing features altogether. To optimize business processes, it is therefore important to find a way to identify and model batch-deserving activities and batch processing areas automatically.
     To address these problems, this paper poses the problem of mining batch processing patterns of workflow instances from workflow logs and studies it in depth. The work proceeds along two lines: 1) for existing workflow systems, how to use workflow logs to identify batch-deserving activities and their possible batch processing modes, and from these to identify batch processing areas; 2) for workflow systems that support batch processing, how to mine from their workflow logs a workflow model that contains batch-processed (grouped) activities, together with the corresponding grouped workflow sub-processes.
     The main contents and contributions of this paper are: 1) a description of the batch-deserving activities implicit in workflow processes and a definition of their batch processing features; 2) an algorithm for identifying the batch processing features of activities and an algorithm for identifying batch processing areas, both for existing workflow systems; 3) a definition of activity-instance batch processing patterns for workflow systems that support batch processing, with a corresponding mining algorithm; 4) simulation experiments, whose results show that these algorithms handle activity-instance batch processing well in the different workflow environments and facilitate the practical application of workflow systems that support batch processing.
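To make the idea of finding batch-deserving activities in a workflow log more concrete, the following is a minimal sketch only. It is not the algorithm proposed in this work, which the abstract does not specify. It assumes a hypothetical log format of (case id, activity, resource, completion time) and a simple heuristic: an activity is a batch processing candidate when the same resource completes it for several cases within a short time window. The names `LOG`, `batch_candidates`, `window`, and `min_size` are all illustrative assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical log: (case_id, activity, resource, completion_time).
LOG = [
    ("c1", "approve", "alice", datetime(2024, 1, 1, 9, 0)),
    ("c2", "approve", "alice", datetime(2024, 1, 1, 9, 1)),
    ("c3", "approve", "alice", datetime(2024, 1, 1, 9, 2)),
    ("c1", "ship", "bob", datetime(2024, 1, 1, 10, 0)),
    ("c2", "ship", "bob", datetime(2024, 1, 1, 13, 0)),
    ("c3", "ship", "carol", datetime(2024, 1, 1, 16, 0)),
]


def batch_candidates(log, window=timedelta(minutes=5), min_size=2):
    """Map each activity to the groups of cases that look batch processed.

    Assumed heuristic (not taken from the source): events of the same
    activity, handled by the same resource, form one group while each
    event follows the previous one within `window`.
    """
    by_key = defaultdict(list)
    for case_id, activity, resource, ts in log:
        by_key[(activity, resource)].append((ts, case_id))

    result = defaultdict(list)
    for (activity, _resource), events in by_key.items():
        events.sort()  # order by completion time, then case id
        group = [events[0]]
        for prev, cur in zip(events, events[1:]):
            if cur[0] - prev[0] <= window:
                group.append(cur)
            else:
                # Gap too large: close the current group and start a new one.
                if len(group) >= min_size:
                    result[activity].append([c for _, c in group])
                group = [cur]
        if len(group) >= min_size:
            result[activity].append([c for _, c in group])
    return dict(result)


if __name__ == "__main__":
    print(batch_candidates(LOG))
```

On this toy log only "approve" is reported, because its three cases are completed by one resource within minutes of each other, while the "ship" events are hours apart or handled by different resources. A real log would need a domain-specific window and would have to account for noise and parallel work.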
