Research on an Ontology-Based Access Mechanism for Unstructured Information
Abstract
With the rapid development of network technology, the volume of data on the network is growing exponentially. Most information is no longer confined to traditional structured forms; it exists in unstructured forms such as e-mail, images, web pages, and workflows. How to represent and access this unstructured information in a uniform way, and how to derive and acquire knowledge from it, is central to the informatization efforts of organizations and is an emerging research direction.
    As a new standard for data representation and data exchange, XML provides a uniform mechanism for describing unstructured information, but its limited semantic expressiveness restricts the representation, exchange, and sharing of information in semantically heterogeneous environments. Ontology technology addresses the problem of semantic differences in XML by building a conceptual model of domain knowledge. It reduces or eliminates confusion over concepts and terminology and makes it possible to retrieve information that is implicit or not explicitly stated. An ontology can therefore add rich semantic background knowledge to unstructured information represented in XML.
    This thesis proposes a general ontology-based access strategy and implementation scheme as an exploration of access mechanisms for unstructured information, in particular access at the semantic level. It mainly comprises the following theories and techniques:
    1. Combining the characteristics of Frame-Logic and SQL, it proposes Fl-Plus, a new SQL-like data manipulation language that supports the various data access operations;
    2. It presents a preliminary design and implementation of an inference engine that parses and maps inference rules and the semantic dictionary. This is the core technique for semantic-level information access: the inference engine helps the computer recognize the semantics of document information and thereby perform intelligent access;
    3. Constraint information generated from the Schema is used to constrain all kinds of information access operations, so as to preserve the validity and integrity of the underlying data as far as possible;
    4. To address the performance bottleneck of XML processing, it combines three techniques (path optimization, ontology set access, and JDOM buffering), which improve system performance to a certain extent;
    5. Drawing on JDBC, it designs the JXSC service interface, which supports a three-tier mode of information access.
    Finally, while taking part in the research and development of "Research on an XML-Based Web Storage System", a research project funded by the Hubei Provincial Department of Education, the author used the above theories as a guide and, combining Java and XML technologies, produced a preliminary implementation of the OBSA-AM (Ontology-based Storage Architecture - Access Mechanism) access system model proposed in this thesis.
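    The idea of semantic-level access described above can be illustrated with a minimal sketch. It is not the thesis's Fl-Plus language or its inference engine, neither of which is specified in this abstract. The concept hierarchy, the tag-to-concept dictionary, the sample document, and the function names are all hypothetical, and Python's standard xml.etree module stands in for the Java and JDOM stack the thesis reports using. The sketch shows how a query for one concept can also return XML elements whose tags denote its subconcepts.

```python
import xml.etree.ElementTree as ET

# Hypothetical is-a hierarchy (assumption): child concept -> parent concept.
IS_A = {
    "Article": "Publication",
    "Thesis": "Publication",
    "Publication": "Document",
    "Email": "Document",
}

# Hypothetical semantic dictionary (assumption): XML tag -> ontology concept.
TAG_TO_CONCEPT = {
    "article": "Article",
    "paper": "Article",
    "thesis": "Thesis",
    "mail": "Email",
}


def is_subconcept(concept, ancestor):
    """Return True if `concept` equals `ancestor` or inherits from it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)  # climb one level; None at the top
    return False


def select_by_concept(root, concept):
    """Return every element whose tag maps to `concept` or to a subconcept."""
    return [
        elem
        for elem in root.iter()
        if elem.tag in TAG_TO_CONCEPT
        and is_subconcept(TAG_TO_CONCEPT[elem.tag], concept)
    ]


# Hypothetical sample document with heterogeneous tag names.
SAMPLE = """
<library>
  <article><title>Ontologies for XML</title></article>
  <paper><title>Path indexes</title></paper>
  <thesis><title>Semantic access</title></thesis>
  <mail><title>Meeting notes</title></mail>
</library>
"""

root = ET.fromstring(SAMPLE)
titles = [e.findtext("title") for e in select_by_concept(root, "Publication")]
print(titles)
```

    A purely syntactic query for the tag "article" would miss the "paper" and "thesis" elements; routing the query through the concept hierarchy is what lets differently named tags be treated as the same kind of thing.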
