Research and Implementation of a DW/DM-Based Invoice Integrated Service Analysis and Decision System for the Local Tax Bureau
Abstract
With continuing research and development in data warehouse, OLAP and data mining techniques, decision support systems (DSS) have reached a new level. An integrated DSS is built on a data warehouse and implemented with online analytical processing (OLAP) and data mining tools. A DSS is a solution rather than a general-purpose product. Implementing one usually follows the principle of overall planning, step-by-step implementation, early benefit, and continuous improvement.
    This thesis first introduces the concept, implementation and development of the data warehouse, and the concepts, methods and techniques of data mining. It then applies this theory to practice. Government revenue departments face new challenges: providing better client services, strengthening tax administration, improving resource management, and accurately predicting the effect of tax rules and policies. In response, the thesis proposes a solution, the Invoice Integrated Service Analysis and Decision System of Local Tax Bureau. The system uses OLAP and data mining techniques to analyze the data in the Invoice Integrated Service Data Warehouse from multiple dimensions and at multiple levels, giving the tax authorities a means of analysis and decision-making for the integrated invoice service.
    In presenting the solution, the thesis first sets out the overall framework of the system and then describes the implementation of each module in turn. The data warehouse design uses the DB-DW-DMs architecture (independent data marts). Data pre-processing is carried out with the DTS tools. OLAP analysis and data mining on the multidimensional cubes use Multidimensional Expressions (MDX) and the Microsoft ADO MD data objects. Reports are processed and generated with Microsoft Excel 2000, a client application that supports PivotTable Service. The data mining module creates and trains data mining models programmatically through DSO (Decision Support Objects) and presents the analysis results visually. Examples then illustrate the system's functions, and the thesis closes by summarizing the work and pointing out directions for future work.
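    The multi-dimension, multi-level analysis mentioned in the abstract can be illustrated with a minimal roll-up sketch. It is not the thesis's implementation, which relies on MDX and ADO MD; the fact rows, dimension names and the roll_up helper below are hypothetical and only show how one measure is aggregated at different levels of detail.

```python
from collections import defaultdict

# Hypothetical invoice fact rows: (region, month, invoice_type, amount).
# The dimensions and values are illustrative, not taken from the thesis.
FACTS = [
    ("North", "2003-01", "service", 120.0),
    ("North", "2003-01", "retail", 80.0),
    ("North", "2003-02", "service", 50.0),
    ("South", "2003-01", "retail", 200.0),
]
DIMENSIONS = ("region", "month", "invoice_type")


def roll_up(facts, keep):
    """Sum the amount measure, grouping only by the dimensions in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]  # the last field is the amount measure
    return dict(totals)


# Coarse view: one total per region.
by_region = roll_up(FACTS, ["region"])
# Finer view: drill down from region to region and month.
by_region_month = roll_up(FACTS, ["region", "month"])
```

    Passing fewer dimension names gives a coarser summary, and passing an empty list gives the grand total. An OLAP server answers the same kind of question over precomputed cubes rather than by scanning rows.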
