Research on Adaptive Web Sites Based on Access Logs
Abstract
A large share of current Web data mining research focuses on Web log mining. A Web site's logs record every action taken by its visitors; mining these records reveals visitors' preferences and offers guidance for optimizing the site. Existing Web sites are organized around the "page": users must follow the topology of the site they are visiting to find the information they need, and pages that serve only as "navigation" are of little use to them. To filter out these navigation pages, improve quality of service, and provide users with personalized service, many Web applications based on data mining have been proposed, such as prefetching, personalized recommendation services, and adaptive Web site services. Adaptive Web site theory can mitigate information overload and the lack of personalization in Web services, and it represents a higher-level goal for the development of Web services.
     The aim of this thesis is to use Web log mining theory to make a Web site adaptive. It first reviews the state of Web log mining research in China and abroad and systematically describes the characteristics and process of Web mining and Web log mining. It then studies in depth the characteristics of adaptive Web sites and their design principles and process. The thesis discusses the main steps and key algorithms in building an adaptive Web site system, with particular emphasis on Web log data preprocessing and Web log data mining. Based on the theory of adaptive Web site construction, it proposes a design model for an adaptive site and analyzes each of its modules in detail. The algorithms discussed are validated against server log data from Maritime University, and the results indicate that they are correct and reasonable. Finally, the model is used to build a simple adaptive Web site system, whose successful operation supports the feasibility of the proposed design model.
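The abstract names Web log data preprocessing as a key step but does not describe the thesis's own algorithms. The sketch below is therefore only an illustration of what such preprocessing commonly looks like, not the author's method. It assumes logs in the Common Log Format, identifies a user by host address, drops failed requests and embedded static resources, and splits each user's requests into sessions using a 30-minute inactivity timeout. The function names, the suffix list, and the timeout value are all assumptions introduced here.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Common Log Format: host ident user [time] "method url protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)
# Assumed list of embedded-resource suffixes to discard during cleaning.
STATIC_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")
# Assumed inactivity threshold; a common heuristic, not taken from the thesis.
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_line(line):
    """Parse one log line into (host, time, method, url, status), or None."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return (m.group("host"), when, m.group("method"),
            m.group("url"), int(m.group("status")))


def clean(records):
    """Keep successful GET requests for pages; drop static resources."""
    for host, when, method, url, status in records:
        if method != "GET" or status != 200:
            continue
        if url.lower().split("?")[0].endswith(STATIC_SUFFIXES):
            continue
        yield host, when, url


def sessionize(records):
    """Group each host's requests into sessions split at long idle gaps."""
    by_host = defaultdict(list)
    for host, when, url in records:
        by_host[host].append((when, url))
    sessions = []
    for host, visits in by_host.items():
        visits.sort()
        current = [visits[0][1]]
        for (prev_time, _), (when, url) in zip(visits, visits[1:]):
            if when - prev_time > SESSION_TIMEOUT:
                sessions.append((host, current))
                current = []
            current.append(url)
        sessions.append((host, current))
    return sessions


def preprocess(lines):
    """Run parsing, cleaning, and session identification in order."""
    parsed = (parse_line(line) for line in lines)
    return sessionize(clean(r for r in parsed if r is not None))
```

Calling `preprocess` on an iterable of raw log lines returns a list of `(host, [url, ...])` pairs, one per session. These page sequences are the kind of input that pattern-mining steps (for example, frequent page-set or path discovery) would then consume. Identifying users by host address alone is a simplification, since proxies and shared addresses can merge distinct visitors.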