Research on Several Key Technologies for Intelligent XML Data Management
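Abstract
With the emergence and transmission of massive XML data, XML has become an important standard for information representation and data exchange on the Internet, which in turn has created a need for XML data management. How to represent, query, and mine XML data effectively has become a major challenge in the field of XML data management.
     To address the problems and shortcomings in current XML data management research, this dissertation studies the principles and methods of XML data models, swarm intelligence, pattern recognition, neural networks, data mining, and intelligent computing. On the prototype system XBASE, it proposes a series of new intelligent management methods based on XML keys for data cleaning, querying, and data mining, and it also explores effective approaches to XML refactoring.
     The dissertation focuses on intelligent management problems such as querying and mining XML data. Its research content and results fall into the following four areas: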
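     1. Establishment of XPDM, an XML data management framework
     Existing XML data models have four problems that hinder effective management of XML data: (1) data heterogeneity, which makes integrating multiple data sources difficult and reduces the effectiveness of information queries; (2) data inconsistency: because data constraints are incomplete, data often becomes inconsistent, which reduces query accuracy; (3) uncertainty in the data dependencies among multiple data sources, which affects merging and querying across them; (4) semantic standards: because XML is still evolving, many specifications are incomplete, which often makes query statements tedious and confusing.
     To address these problems, the dissertation proposes XPDM, a management framework for massive XML data that is based on a vector space model built from XML keys and operated on using probability theory. By extending the semantic specification of XDM (the XQuery 1.0 and XPath 2.0 data model) and by converting XML data into vectors, the framework addresses the four problems above reasonably well.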
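     2. Intelligent data cleaning and query strategies
     To address the "dirty data" problem in XML documents, XML key combinations and an XML vector model are introduced, and Bayesian learning together with a Markov chain probability transition strategy is used to build a metadata model of the XML data cleaning process. Using an XML tree similarity determination algorithm, a new method for intelligently cleaning XML data is proposed, in which cleaning is carried out through predefined rule bases. In addition, to address the tedious detection and poor flexibility of XML data cleaning, an optimized cleaning algorithm is proposed that is built by suitably combining XML keys, incorporating particle swarm optimization, and applying a hidden Markov model information extraction strategy. To make XML data querying more intelligent and effective, a heuristic approach that takes the semi-structured nature of XML into account integrates the particle swarm and ant colony algorithms, with corresponding improvements, into probabilistic querying of massive XML data, improving both parallel processing over the query range and convergence efficiency.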
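     3. Intelligent XML data mining strategies
     Massive amounts of XML data have accumulated on the Internet. To mine this data effectively, the dissertation studies the following aspects:
     (1) To improve the clustering quality of massive XML document collections, two algorithms are proposed, based on particle swarm optimization and on a matrix-iteration self-organizing algorithm respectively: a particle-swarm-based adaptive chaotic clustering algorithm for XML, and a vector-space-model-based matrix-iteration self-organizing auxiliary clustering algorithm for XML;
     (2) To improve parallel processing of massive XML document collections, an XML fragmentation algorithm based on chaos theory and an ant colony clustering model is proposed, which incorporates ant colony clustering and defines a chaotic fitness function to measure the similarity between an ant and its neighborhood;
     (3) Given the flowing and unbounded nature of XML data and the shortcomings of existing quality detection, an adaptive monitoring algorithm for XML data is proposed: a vector matrix of XML keys is constructed as the window, multilevel decomposition and reconstruction by vector-product wavelet transform are applied, and least squares support vector machines are used to build double sliding windows, meeting the quality management requirements for transmitting XML data over networks.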
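     4. Intelligent XML refactoring strategy
     To better optimize XML semantic specifications, and to handle the fact that XML data structures change as user requirements change and time passes, an exploratory study of XML refactoring is carried out. Building on XML document fragment refactoring, and using XML semantic constraints and the hierarchical nature of XML paths together with vector machine principles and the characteristics of frequent patterns, an XML frequent pattern tree algorithm, XFP-tree, is proposed as a strategy for XML structure refactoring, which helps further ensure XML quality.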
With the emergence and transmission of massive XML data, XML has become an important standard for information representation and data exchange on the Internet. As a result, requirements for XML data management have been evolving and now present an important challenge in the XML database domain. How to represent, query, and mine XML data effectively is a problem of both theoretical and practical value.
     In view of the problems and shortcomings of current XML data management research, this dissertation draws on current theories and methods of XML data models, swarm intelligence, pattern recognition, neural networks, data mining, and intelligent computing. Based on the prototype system XBASE (XML DataBase), it proposes new intelligent management methods that use XML keys for data cleansing, querying, and data mining, and it also discusses efficient approaches to XML refactoring.
     The dissertation addresses the intelligent management of XML data, in particular querying and mining, in the following four areas:
     1. Establishment of the XML data management framework XPDM
     Existing XML data models have four problems that hinder effective management of XML data: (1) heterogeneous data: different parties often name and describe the same data object differently, which makes multi-source integration difficult and reduces the effectiveness of information queries; (2) inconsistent data: because data constraints are incomplete, data is often inconsistent, which reduces query accuracy; (3) uncertain data dependencies among data sources, which interfere with merging and querying across sources; (4) semantic standards: because XML is still evolving, many specifications are incomplete and there is as yet no unified standard, which makes query statements tedious and confusing.
     To address these problems, the dissertation proposes XPDM (XML-based Probability Data Model), an object-oriented management framework for massive XML data that is based on a vector space model built from XML keys and on probability theory. The framework addresses the four problems above reasonably well by extending the semantic specification of the XQuery 1.0 and XPath 2.0 Data Model (XDM) and by converting XML data into vectors.
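     As a rough illustration of what converting XML data into vectors can look like, the sketch below maps each document to a sparse vector of tag-path counts and compares documents by cosine similarity. This is an assumption made for illustration only: the abstract does not specify how XPDM builds its vectors from XML keys, and the function names `path_vector` and `cosine` and the sample documents are hypothetical.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def path_vector(xml_text):
    """Count how often each root-to-element tag path occurs in a document.

    Hypothetical stand-in for an XML-key-based vector; not the XPDM definition.
    """
    root = ET.fromstring(xml_text)
    counts = Counter()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        counts[path] += 1
        for child in node:
            walk(child, path)

    walk(root, "")
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Illustrative documents (invented for this sketch).
doc_a = "<book><title>XML</title><author>A</author></book>"
doc_b = "<book><title>Data</title><author>B</author><author>C</author></book>"
doc_c = "<order><item>pen</item></order>"

vec_a, vec_b, vec_c = (path_vector(d) for d in (doc_a, doc_b, doc_c))
sim_ab = cosine(vec_a, vec_b)  # structurally similar documents
sim_ac = cosine(vec_a, vec_c)  # no shared paths
print(round(sim_ab, 3), sim_ac)
```

     Two documents that share most of their paths score close to 1, while documents with no paths in common score 0, which is the property a vector-based model needs for similarity queries and clustering.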
     2. Intelligent data cleansing and query strategy
     To address the dirty data problem in XML documents, the dissertation proposes a new intelligent data cleansing method. It introduces XML key combinations and an XML vector model, uses Bayesian learning and a Markov chain probability transition strategy to build a metadata model of the XML data cleansing process, and applies an XML tree similarity checking algorithm, so that cleansing is carried out through predefined rule bases. Moreover, because detection in earlier XML data cleansing was tedious and inflexible, the dissertation proposes an optimized cleansing algorithm that combines XML keys, incorporates the PSO algorithm, and applies a hidden Markov model information extraction strategy. To make XML data querying more intelligent and effective, it uses a heuristic method that takes the semi-structured nature of XML into account and integrates the PSO and ACO algorithms, with corresponding improvements, into probabilistic querying of massive XML data, improving parallel processing over the query range and convergence efficiency.
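     The idea of using XML keys to detect dirty data can be sketched as follows: records whose key values are nearly identical are flagged as likely duplicates. This is a minimal sketch under stated assumptions, not the dissertation's algorithm; it substitutes plain string similarity from the standard library's `difflib` for the Bayesian, Markov, and PSO components, and the element names, the threshold of 0.85, and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher
from itertools import combinations


def key_of(record, key_fields):
    """Build a normalized key tuple from the chosen child elements."""
    return tuple((record.findtext(f) or "").strip().lower() for f in key_fields)


def key_similarity(k1, k2):
    """Average string similarity over the key components (0.0 to 1.0)."""
    scores = [SequenceMatcher(None, a, b).ratio() for a, b in zip(k1, k2)]
    return sum(scores) / len(scores)


def find_duplicates(xml_text, record_tag, key_fields, threshold=0.85):
    """Return index pairs of records whose keys are at least `threshold` similar."""
    root = ET.fromstring(xml_text)
    records = root.findall(record_tag)  # direct children of the root only
    keys = [key_of(r, key_fields) for r in records]
    return [
        (i, j)
        for i, j in combinations(range(len(keys)), 2)
        if key_similarity(keys[i], keys[j]) >= threshold
    ]


# Invented sample data: records 0 and 1 differ only in case and spacing.
SAMPLE = """
<library>
  <book><title>XML Data Management</title><author>Zhang San</author></book>
  <book><title>XML data management </title><author>Zhang  San</author></book>
  <book><title>Data Mining</title><author>Li Si</author></book>
</library>
"""

pairs = find_duplicates(SAMPLE, "book", ["title", "author"])
print(pairs)
```

     A rule base in the sense described above would then decide what to do with each flagged pair, for example merging the records or keeping the more complete one.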
     3. Intelligent XML data mining strategy
     Massive amounts of XML data have accumulated on the Internet. To mine this data effectively, the dissertation pursues the following directions:
     (1) To improve the clustering quality of massive XML document collections, the dissertation proposes an XML document clustering algorithm based on adaptive PSO with chaos, and a vector-matrix iterative self-organizing auxiliary clustering algorithm for XML documents; they build on the PSO algorithm and on matrix iteration over the vector space model, respectively;
     (2) To improve parallel processing of massive XML document collections, the dissertation proposes a parallel XML document fragmentation algorithm based on chaos theory and an ant clustering model, which defines a chaotic fitness function to measure the similarity between an ant and its neighborhood;
     (3) Given the flowing and unbounded nature of XML data and the shortcomings of existing XML data quality detection, the dissertation proposes an adaptive monitoring algorithm that constructs a vector matrix of XML keys as the window, applies multilevel decomposition and reconstruction by vector-product wavelet transform, and uses least squares support vector machines to build double sliding windows for monitoring XML data, meeting the quality management requirements for transmitting XML data over networks.
     4. Intelligent XML refactoring strategy
     To optimize XML semantic consistency and to accommodate changes in XML structure as user requirements change over time, the dissertation presents an exploratory study of intelligent XML refactoring. Building on document fragment refactoring, and using XML semantic constraints and the hierarchy of XML paths together with vector machine principles and the characteristics of frequent patterns, it proposes the XML frequent pattern tree algorithm XFP-tree as a strategy for XML structure refactoring, which helps further ensure XML quality.
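     To make the notion of frequent patterns over XML paths concrete, the sketch below counts in how many documents each tag path appears and keeps the paths whose support reaches a threshold. It is a plain support count, not the XFP-tree algorithm, whose construction the abstract does not describe; the function names, the 0.6 threshold, and the sample documents are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def document_paths(xml_text):
    """Return the set of distinct root-to-element tag paths in one document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def frequent_paths(documents, min_support):
    """Map each path to its support (fraction of documents containing it),
    keeping only paths whose support is at least `min_support`."""
    support = Counter()
    for doc in documents:
        support.update(document_paths(doc))  # each path counted once per document
    n = len(documents)
    return {p: c / n for p, c in support.items() if c / n >= min_support}


# Invented sample collection.
docs = [
    "<order><id>1</id><customer><name>a</name></customer></order>",
    "<order><id>2</id><customer><name>b</name><phone>1</phone></customer></order>",
    "<order><id>3</id><note>x</note></order>",
]

result = frequent_paths(docs, min_support=0.6)
for path in sorted(result):
    print(path, round(result[path], 2))
```

     Paths that stay frequent across the collection are candidates for the stable core of a refactored structure, while rare paths such as `/order/note` here point to optional or drifting parts.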
