Analysis of College Students' Academic Achievement Based on Association Rule Data Mining (基于关联规则数据挖掘技术的高校学生学习成绩分析)

English title: The Analysis of College Student Achievement Based on Association Rules Mining Technology
Author: 吴喜萍 (Wu Xiping)
Degree level: Master's
Discipline: Educational Technology
Keywords: Data Mining ; Achievement ; Apriori ; AprioriTid
Degree year: 2010
Advisor: 段凡丁 (Duan Fanding)
Discipline code: 040110
Degree-granting institution: Southwest Jiaotong University (西南交通大学)
Thesis submission date: 2010-03-01

Abstract

In recent years, continued enrollment expansion has sharply increased the number of students and teachers at colleges and universities, putting student management and teaching under severe strain. Traditional teaching-management methods can no longer keep pace with social development. Universities run many information systems and databases, such as student registration, grade management, and personnel management systems, and these have accumulated large amounts of data. Lacking the necessary information technology and tools, however, administrators can obtain only surface-level information through simple statistical analysis, sorting, and backup, so the information hidden behind the data is not used effectively.
     Data mining (DM) discovers hidden patterns in historical data and applies those patterns to prediction. By analyzing large volumes of existing data, it supports scientific research, business decisions, and enterprise management, and thereby serves decision support. Association rule mining is one of the most active research directions in data mining. It captures knowledge about how one event depends on, or is associated with, other events.
     This thesis first gives a general discussion of data mining, covering its history, concepts, and related technologies. It then studies association rule mining algorithms in depth, analyzes the classical Apriori algorithm and the AprioriTid algorithm, summarizes the problems in these algorithms, and proposes an improved algorithm based on AprioriTid. Finally, following the standard data mining process, the improved algorithm is applied to the grades of students from different majors in the 2004 through 2008 cohorts of one university in the course "Computer Programming Fundamentals and VF" (《计算机程序设计基础与VF》). The mining results identify factors that affect achievement and so provide a basis for improving teaching quality.
     Grades are not the only university data worth mining. Student age (maturity of cognitive thinking), gender, hobbies, family background, health status, registration status, educational background, college entrance examination scores, course content, exam papers, and teacher information can also be mined. The results can give administrators and teachers a basis for decisions, help them teach students according to their individual characteristics, and improve both teaching quality and the effectiveness of teaching management.
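As an illustration of the kind of algorithm the abstract refers to, the following is a minimal Python sketch of the classical AprioriTid procedure as it is commonly described: after the first pass it no longer scans the raw records, and instead carries forward, for each transaction, the set of candidate itemsets that transaction contains. This is not the improved algorithm proposed in the thesis, which this record does not describe. The function names, the attribute labels, the sample records, and the thresholds are all hypothetical and chosen only for the example.

```python
from itertools import combinations


def apriori_gen(prev_frequent):
    """Join (k-1)-itemsets that share a prefix, then prune by the Apriori property."""
    prev = sorted(prev_frequent)
    prev_set = set(prev)
    candidates = []
    for i, a in enumerate(prev):
        for b in prev[i + 1:]:
            if a[:-1] != b[:-1]:
                break  # sorted order: no later b shares a's prefix
            cand = a + (b[-1],)
            # keep the candidate only if every (k-1)-subset is frequent
            if all(sub in prev_set for sub in combinations(cand, len(a))):
                candidates.append(cand)
    return candidates


def apriori_tid(transactions, min_support):
    """Return {itemset as sorted tuple: support count} for all frequent itemsets."""
    # First pass: encode each transaction as the set of 1-itemsets it contains.
    encoded = [{(item,) for item in t} for t in transactions]
    counts = {}
    for entry in encoded:
        for c in entry:
            counts[c] = counts.get(c, 0) + 1
    frequent = {c: n for c, n in counts.items() if n >= min_support}
    result = dict(frequent)
    while frequent:
        candidates = apriori_gen(frequent)
        counts = dict.fromkeys(candidates, 0)
        next_encoded = []
        for entry in encoded:
            # A candidate is contained in the transaction exactly when both
            # (k-1)-itemsets that generated it were contained in the last pass.
            present = {c for c in candidates
                       if c[:-1] in entry and c[:-2] + c[-1:] in entry}
            for c in present:
                counts[c] += 1
            if present:  # transactions with no candidates drop out
                next_encoded.append(present)
        encoded = next_encoded
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
    return result


def association_rules(frequent, min_confidence):
    """Derive rules lhs -> rhs with confidence = support(itemset) / support(lhs)."""
    rules = []
    for itemset, n in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                confidence = n / frequent[lhs]
                if confidence >= min_confidence:
                    rhs = tuple(i for i in itemset if i not in lhs)
                    rules.append((lhs, rhs, confidence))
    return rules


# Hypothetical, hand-made student records (not data from the thesis).
records = [
    {"attendance=high", "homework=done", "grade=good"},
    {"attendance=high", "homework=done", "grade=good"},
    {"attendance=high", "homework=missed", "grade=poor"},
    {"attendance=low", "homework=missed", "grade=poor"},
    {"attendance=high", "homework=done", "grade=good"},
]
frequent_sets = apriori_tid(records, min_support=3)
found_rules = association_rules(frequent_sets, min_confidence=0.9)
```

Here `min_support` is an absolute count rather than a fraction, and continuous attributes are assumed to have been discretized into labels beforehand. Real grade data would need that preprocessing step before any rule mining.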
