Research on Semantics-Based Web Knowledge Acquisition Techniques (基于语义的网络知识获取相关技术研究)


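English title: The Research on Web Knowledge Acquirement Based on Semantic Techniques
Author: Guo Yong (郭勇)
Thesis level: Doctoral
Discipline: Management Science and Engineering
Chinese keywords: 概念语义 ; NMF ; 文本分类 ; 信息抽取 ; 用户模板构造 ; 近似查询
English keywords: concept semantics ; NMF ; text classification ; information extraction ; user profile construction ; approximate query
Degree year: 2007
Supervisor: Zhang Weiming (张维明)
Discipline code: 1201
Degree-granting institution: National University of Defense Technology (国防科学技术大学)
Submission date: 2007-10-01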

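Abstract

With the rapid growth of the Internet, the Web now holds massive, heterogeneous, semi-structured, and dynamic information resources, and more than 80% of this information exists as Web text. How to find and acquire valuable information and knowledge patterns in this vast body of Web resources has become a pressing problem in information processing. Semantics-based Web knowledge acquisition helps address it: it can make users' online searches more efficient, organize search results into categories, help users locate target knowledge quickly, and extract valuable knowledge from the results.
     After analyzing the state of the art and the open problems in Web knowledge acquisition, this dissertation studies concept semantics generation, text classification methods, the generation of typical user session profiles, and concept-based approximate query techniques. The main results are as follows:
     (1) A method for generating concept semantics based on non-negative matrix factorization (NMF) is proposed, exploiting the fact that NMF is simple to implement and that both the form and the result of its decomposition are interpretable. By analogy with image decomposition, a text vector is treated as an image and a feature-term value as the gray level of a pixel; NMF is then applied to extract the concept semantics of text vectors, which offers a new approach to large-scale text processing. Experimental results and a comparison with related work show that the concept semantics generated by NMF accurately reflect the local features of the samples and help resolve the ambiguity inherent in natural-language representation.
     (2) The concept semantic vectors generated by NMF are applied to Web text classification. Because the local concept semantic vectors produced by NMF correspond directly to sample features and capture the characteristics of the texts in each category, they discriminate better than global concept semantic vectors, which capture features common to all texts. Experiments compare how building a local versus a global concept semantic space affects classification results, and show that classification in the NMF-generated local concept semantic space is more accurate.
     (3) Based on the characteristics of NMF when decomposing large-scale text matrices, an NMF-based method for discovering typical user session profiles is proposed. NMF is applied to the term-text matrix to obtain the correlations between terms; on this basis, semantic vectors and weight vectors are introduced, and user profiles are extracted by defining the class closeness of semantic vectors. To keep the concept semantic vectors orthogonal and reduce redundancy among them, LNMF, a variant of NMF, is chosen for dimensionality reduction, and an LNMF-based algorithm for extracting typical user session profiles is designed. Because the concept semantic vectors obtained by LNMF are as orthogonal as possible, experimental analysis shows that LNMF clusters well and is suitable for discovering typical user session profiles.
     (4) To address the shortcomings of approximate ontology concept queries based on the least upper bound and greatest lower bound of a concept, the best approximation of a concept is defined. Using the implication relations among complex concepts, the notions of multi-element bound and simplest multi-element bound are introduced. Related properties and theorems prove that the best approximation of a concept can be obtained from its multi-element bounds, which reduces the problem of finding the best approximation to that of finding the simplest multi-element bounds. On this basis, an approximate ontology concept query method based on the simplest multi-element bounds is proposed; it effectively eliminates redundancy in query rewriting and improves both the quality of approximate queries and the efficiency of query rewriting.
     (5) An algorithm for computing the simplest multi-element least upper bounds of a concept is given. Measures that use an iterative, incremental procedure and the concept hierarchy to shrink the search space and improve efficiency are discussed in detail, proofs of the algorithm's correctness and completeness are given, and its effectiveness is analyzed.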
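Items (1) to (3) all rest on factoring a non-negative term-text matrix with NMF. The block below is a minimal sketch of that step, not the dissertation's implementation: it uses the standard multiplicative-update rules of Lee and Seung for the squared-error objective. The function name `nmf`, the toy vocabulary, the matrix values, and the rank `k=2` are illustrative assumptions.

```python
import numpy as np


def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (terms x texts) as V ~ W @ H.

    Columns of W play the role of concept semantic vectors over terms;
    columns of H give each text's weights on those concepts.
    """
    rng = np.random.default_rng(seed)
    n_terms, n_texts = V.shape
    W = rng.random((n_terms, k)) + eps
    H = rng.random((k, n_texts)) + eps
    for _ in range(iters):
        # Multiplicative updates: entries stay non-negative because each
        # step only multiplies by a ratio of non-negative quantities.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical toy data: rows are terms, columns are texts.
terms = ["goal", "match", "team", "stock", "market", "price"]
V = np.array([
    [2, 3, 0, 0],
    [1, 2, 0, 0],
    [3, 1, 0, 1],
    [0, 0, 2, 3],
    [0, 0, 3, 1],
    [0, 1, 1, 2],
], dtype=float)

W, H = nmf(V, k=2)

# List the three highest-weighted terms of each concept vector.
for j in range(W.shape[1]):
    top = np.argsort(W[:, j])[::-1][:3]
    print(f"concept {j}:", [terms[i] for i in top])
```

Because every factor is non-negative, each column of `W` can be read directly as a weighted group of terms, which is the interpretability property the abstract relies on. A classifier or clustering step would then work on the columns of `H` instead of the raw term vectors.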
With the rapid development of the Internet, the Web holds abundant, heterogeneous, semi-structured, and dynamic information resources, more than 80 percent of which exist as Web text. How to find and obtain valuable information and knowledge patterns in these vast resources has become an urgent problem in information processing. Web knowledge acquisition can address this problem effectively: it classifies search results, which makes searching more efficient for Web users, helps them locate target knowledge, and extracts valuable knowledge.
     Based on an analysis of the current state of research and the open problems in Web knowledge acquisition, this dissertation studies the key techniques of concept semantics generation, common text classification methods, user profile construction, and concept-based approximate querying. The main contributions are as follows.
     (1) Because the decomposition produced by the NMF algorithm is simple to implement and interpretable, a concept semantics generation method based on NMF is proposed. By analogy with image decomposition, NMF is applied to extract concept semantics from text vectors, providing a new approach to large-scale text processing. The experimental results and a comparison with related work indicate that the concept semantics obtained with NMF accurately reflect the local characteristics of the samples, which helps resolve the ambiguity inherent in natural-language representation.
     (2) Text classification based on NMF is studied. The local concept semantic vectors produced by NMF have stronger classification capacity than global concept semantic vectors, because the former correspond directly to sample characteristics and reflect the distinct characteristics of the texts in each class. Experiments compare the effect of constructing a local versus a global concept semantic space on text classification results. The results indicate that classification in the local concept semantic space produced by NMF is more precise.
     (3) Taking advantage of NMF's efficiency in decomposing large-scale text matrices, an NMF-based method for constructing typical user session profiles is presented. NMF decomposes the term-text matrix to capture the relations between terms. The concepts of semantic vectors and weight vectors are then introduced, and the class closeness degree is defined to extract the user profile. To keep the concept semantic vectors orthogonal and reduce redundancy among them, LNMF is used for dimensionality reduction. Because the concept semantic vectors obtained by LNMF are as orthogonal as possible, the experimental results show that the LNMF method not only improves filtering precision markedly but also clusters well.
     (4) To handle query reformulation, an approximate ontology concept query method based on the simplest multi-element bound is proposed. First, the best approximation of a concept is defined. Using the implication relations between complex concepts, the multi-element bound and the simplest multi-element bound are defined, which makes it possible to obtain the best approximation of a concept from its multi-element bounds. The problem of finding the best approximation is thus transformed into that of finding the simplest multi-element bounds. Related properties and theorems show that the method effectively reduces redundancy in query reformulation and improves the quality and efficiency of approximate queries.
     (5) An algorithm for computing the simplest multi-element least upper bound of a concept is proposed. The procedure and the measures used to reduce the search space and improve efficiency are discussed in detail. Finally, the correctness and completeness of the algorithm are proved.
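Items (4) and (5) start from the conventional way of approximating a query concept by its least upper bounds and greatest lower bounds among the concepts of a target ontology. The sketch below illustrates only that baseline, under a strong simplifying assumption: each concept is given as a finite set of instances, and subsumption is plain set inclusion. The dissertation's own refinement of these bounds is not reproduced here, and all concept names and instance ids are hypothetical.

```python
def least_upper_bounds(query, targets):
    """Names of the most specific target concepts that contain the query."""
    uppers = {name: ext for name, ext in targets.items() if query <= ext}
    return sorted(
        name for name, ext in uppers.items()
        # Keep a bound only if no other upper bound is strictly smaller.
        if not any(other < ext for other in uppers.values())
    )


def greatest_lower_bounds(query, targets):
    """Names of the most general target concepts contained in the query."""
    lowers = {name: ext for name, ext in targets.items() if ext <= query}
    return sorted(
        name for name, ext in lowers.items()
        # Keep a bound only if no other lower bound is strictly larger.
        if not any(ext < other for other in lowers.values())
    )


# Hypothetical target ontology: concept name -> set of instance ids.
targets = {
    "Thing": {1, 2, 3, 4, 5, 6, 7, 8},
    "Vehicle": {1, 2, 3, 4, 5},
    "Car": {1, 2, 3},
    "SportsCar": {1},
    "Truck": {4, 5},
}

# Query concept from another ontology, with no exact match in the target.
query = {1, 2}

lubs = least_upper_bounds(query, targets)
glbs = greatest_lower_bounds(query, targets)

# Upper approximation: everything in all least upper bounds (may over-return).
upper = set.intersection(*(targets[name] for name in lubs))
# Lower approximation: everything in some greatest lower bound (may miss hits).
lower = set().union(*(targets[name] for name in glbs))

print("least upper bounds:", lubs, "->", upper)
print("greatest lower bounds:", glbs, "->", lower)
```

In this toy case the upper approximation returns one instance too many and the lower approximation misses one, which is the kind of gap that motivates looking for tighter bounds.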

