Research on Native XML Database Storage
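Abstract
With the spread and adoption of XML-related standards, a large number of XML documents have appeared on the Web. To manage them effectively, XML documents need to be stored in databases, and storage schemes have become an important research topic in XML data management. Native XML databases take the characteristics of XML data fully into account and process XML data in a natural way; they support XML storage and querying well in all respects and achieve good results, so the Native XML database storage approach is well worth studying.
     Starting from the fundamentals of XML, this thesis studies current Native XML data storage methods in depth. Under current storage strategies, storage performance for XML data is not high, and support for queries and updates is not very efficient. To address these shortcomings, this thesis presents WNXD, a fairly general Native XML document storage strategy for semi-structured information, together with its implementation scheme. WNXD adopts a semi-structured information access mechanism with the following characteristics: record-based and page-based storage modes are combined to ensure, as far as possible, the effectiveness and integrity of access to underlying data that is heterogeneous in form and mixed in content, and to reduce the number of I/O operations per query; dynamic inversion and a data mapping mechanism are introduced to support XML documents effectively and to improve system processing performance to some extent; a new storage model, designed around the characteristics of XML, gives better support for query and update operations on XML data; and three structured indexes (an inverted table, an address table, and a record content table) provide an access strategy for XML data that covers everything from the whole to the part, from elements and attributes down to concrete values.
     Finally, this thesis implements the WNXD storage strategy in a program and compares its performance with that of relatively successful Native XML database systems. The experimental results show that the WNXD storage strategy supports XML data better.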
With the growing popularity and application of XML-related standards, large repositories of XML documents have emerged on the Web. These documents need to be stored in a database to make them manageable, and storage schemes have become an important research topic in the field of XML data management. A Native XML database takes the characteristics of XML data fully into account, handles XML data in a natural way, supports storage and querying well in all respects, and achieves good results. Native XML database storage is therefore of considerable research value.
     This thesis builds on the fundamentals of XML and relational databases and studies in depth the storage methods of Native XML databases. Current storage methods are inefficient in both time and space when storing XML data, and they do not support high-performance queries and updates. To address these shortcomings, this thesis puts forward a uniform access strategy, WNXD, as a tentative exploration of access mechanisms for semi-structured information. It comprises the following techniques: the record-based storage model is combined with the page-based model to ensure, as far as possible, the effectiveness and integrity of data access for all kinds of heterogeneous, mixed-content data, and to reduce the number of I/O operations; dynamic inversion and merging, combined with a data mapping mechanism, support XML documents efficiently and improve the processing capability of the system to some extent; a new storage model, designed around the characteristics of XML, gives better support for query and update operations on XML data; and three indexes (an inverted table, an address table, and a record content table) provide a multilevel access strategy for XML data, from the whole document to its parts, and from elements and attributes down to individual values.
     Finally, the WNXD storage strategy is implemented and its performance is compared with that of relatively successful present-day Native XML database systems. The results show that WNXD performs better than the systems it was compared against.
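The abstract names three index structures but gives no schema for them, so the following is only a minimal illustrative sketch of how an inverted table, an address table, and a page-organized record content table could cooperate. It is not the WNXD implementation. The class name ToyStore, the record fields, the term keys, and PAGE_CAPACITY are all hypothetical choices made for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

PAGE_CAPACITY = 4  # hypothetical: number of records held by one page


class ToyStore:
    """Toy in-memory model; every name and layout here is an assumption."""

    def __init__(self):
        self.pages = []                    # record content table, grouped into pages
        self.address = {}                  # address table: record id -> (page no, slot)
        self.inverted = defaultdict(list)  # inverted table: term -> list of record ids
        self._next_id = 0

    def _add_record(self, kind, name, value, parent):
        rid = self._next_id
        self._next_id += 1
        # Open a new page when there is none yet or the last one is full.
        if not self.pages or len(self.pages[-1]) >= PAGE_CAPACITY:
            self.pages.append([])
        page = self.pages[-1]
        page.append({"id": rid, "kind": kind, "name": name,
                     "value": value, "parent": parent})
        self.address[rid] = (len(self.pages) - 1, len(page) - 1)
        # Index the record by its element/attribute name and, if present, its value.
        self.inverted[(kind, name)].append(rid)
        if value:
            self.inverted[("value", value)].append(rid)
        return rid

    def load(self, xml_text):
        self._walk(ET.fromstring(xml_text), None)

    def _walk(self, elem, parent):
        text = (elem.text or "").strip()
        rid = self._add_record("element", elem.tag, text, parent)
        for attr_name, attr_value in elem.attrib.items():
            self._add_record("attribute", attr_name, attr_value, rid)
        for child in elem:
            self._walk(child, rid)

    def fetch(self, rid):
        # Resolve a record id through the address table to its page slot.
        page_no, slot = self.address[rid]
        return self.pages[page_no][slot]

    def find(self, kind, name):
        # Inverted table -> record ids -> address table -> record contents.
        return [self.fetch(rid) for rid in self.inverted.get((kind, name), [])]


if __name__ == "__main__":
    store = ToyStore()
    store.load('<lib><book id="1"><title>XML</title></book></lib>')
    for record in store.find("element", "title"):
        print(record)
```

In this sketch a lookup by element name, attribute name, or value touches the inverted table first and then reads only the page slots listed in the address table, which is one plausible reading of how such indexes could limit I/O. The thesis itself may organize these tables quite differently.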
