Research on Stream Data Processing Techniques for Network Security Monitoring
Abstract
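With the continuing development of information technology, the Internet plays an increasingly important role in people's lives. The network security incidents that have come with it, however, seriously threaten the application and development of the Internet. Security-oriented network monitoring therefore plays an increasingly important role in keeping networks running normally and efficiently, protecting critical facilities, and securing information systems. How to perform efficient online analysis of real-time, massive network security monitoring data, and thereby support downstream applications, has become a research focus in network security and in data analysis and processing.
     Against the background of network security monitoring, and addressing its application characteristics and the challenges of analyzing real-time monitoring data, this dissertation studies, from the perspective of dynamic real-time data stream processing, efficient query techniques for four kinds of network security monitoring stream data: continuous top-k monitoring, table join optimization, multi-query optimization, and large time window queries. The main contributions are: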
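     1. An improved continuous top-k query method for data streams, as used to monitor important events in network security monitoring streams. An effective index structure improves query efficiency, but existing top-k indexes for data streams are generally grid-based and contain a large number of free data points (points that can be proven not to be top-k results). To address this, the dissertation proposes k-MCR, a top-k index based on the reverse dominant point set (RDPS). The index uses properties of the RDPS to prune a large number of free data points, and update algorithms for k-MCR are given for the insertion and deletion of data points in the stream. Theoretical analysis and experiments on real data sets show that the k-MCR index has advantages in storage size and query efficiency.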
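     2. A join optimization method between monitoring streams and very large dimension tables, targeting the join operation that is central to analyzing network security monitoring streams. The IP address dimension contains 2^32 records; such a dimension table cannot stay resident in memory during a join and must be partitioned into sub-blocks that are read into memory in turn, which causes frequent disk I/O. To address this, the dissertation proposes the multi-dynamic-index nested-loop join algorithm (MDI-NL-Join), which decomposes the very large dimension table by column, compresses it, and keeps it resident in memory. The dimension table is split by its m attribute columns into m sub-dimension tables, which are compressed, and an index is built on each. At join time the sub-dimension table indexes are scheduled dynamically according to the attribute columns named in the query's projection, so that the sub-dimension tables with the least redundancy are used and the index speeds up join probing. Theoretical analysis and experiments show that the algorithm is particularly suited to very large dimension tables whose join keys are highly redundant (such as IP dimension tables and bank account dimension tables). Because the dimension table is compressed losslessly into smaller sub-tables, the cost of probing and scanning the dimension table in each nested-loop pass drops, which improves join efficiency and reduces the memory needed to store the dimension table.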
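     3. Optimization of multiple concurrent queries over network security monitoring data streams. Multi-query optimization on data streams can use two approaches: optimizing the multi-query plan, and materializing shared intermediate results. Existing work on materialized shared intermediate results uses materialized views or indexes and does not compress the stored results. To address this, the dissertation proposes storing the materialized shared intermediate results as a stream cube, together with a compressed storage structure for it, the compressed stream cube. The method builds on the classic tree-shaped compressed data cube structures QC-tree and Dwarf. To further reduce the memory occupied by the stream cube slice for each unit of time, it replaces full materialization of all intermediate results in the compressed structure with partial materialization driven by the needs of the queries. It also builds a benefit model for materializing nodes and dynamically selects the nodes not to materialize according to the frequency of each query. Dynamic selection prunes a large number of materialized nodes that no query needs. It unavoidably also prunes some materialized nodes that queries do need, which lowers query performance, but it compresses the stream cube well within limited memory. Taking StreamQCtree as an example, theoretical analysis and experiments show that when only a few queries are processed per unit of time on network security data streams, a large number of nodes can be pruned, effectively compressing stream cube storage at the cost of a small reduction in query response speed.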
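     4. A study of large time window queries over network security monitoring streams. Because computing and storage resources are limited, the query time window W of a data stream system is restricted to a recent span of time, and queries over a larger window extending beyond W cannot be computed. To address this, the dissertation proposes an incremental hybrid storage architecture that combines the real-time computation of a data stream system with the mass storage of a traditional database system. The data stream system uses divide and conquer: it splits the time window into equal, non-overlapping small time blocks and computes only over the small blocks to obtain the result for the whole window, which improves its computational efficiency. In the traditional database, materialized views store the query results outside window W and are updated incrementally with the incoming small-block results, which avoids inefficient update computation in the database and improves materialized view update efficiency. The system is limited to aggregate queries that can be computed incrementally and is applicable to online analytical processing.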
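     In summary, starting from the data analysis and query problems that urgently need solving in network security monitoring stream processing applications, this dissertation conducts breakthrough research on key algorithms for several important basic problems: continuous top-k monitoring on data streams, table joins, multi-query optimization, and large time window queries. The work has theoretical significance and practical value for advancing both the theory and the practical use of large-scale network security monitoring data analysis.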
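The column decomposition described in contribution 2 can be illustrated with a small Python sketch. It is not the MDI-NL-Join algorithm itself: it assumes dense integer join keys (standing in for IP addresses), uses simple run-length compression of each column plus binary search as the per-column index, and all names (`compress_column`, `ColumnSplitDimension`, `stream_join`) are hypothetical.

```python
from bisect import bisect_right


def compress_column(rows, column):
    """Run-length compress one attribute column of a dimension table.

    rows: list of (key, attrs) pairs sorted by key, with dense integer keys
    (an assumption of this sketch). Returns (starts, values): run i covers
    keys from starts[i] up to starts[i + 1] - 1 and maps them to values[i].
    """
    starts, values = [], []
    for key, attrs in rows:
        value = attrs[column]
        if not values or values[-1] != value:
            starts.append(key)
            values.append(value)
    return starts, values


class ColumnSplitDimension:
    """Hypothetical in-memory dimension table, one compressed sub-table per column."""

    def __init__(self, rows, columns):
        rows = sorted(rows, key=lambda row: row[0])
        self.min_key = rows[0][0]
        self.max_key = rows[-1][0]
        self.sub_tables = {c: compress_column(rows, c) for c in columns}

    def lookup(self, key, column):
        """Binary-search the run that contains key in the sub-table for column."""
        if not self.min_key <= key <= self.max_key:
            return None
        starts, values = self.sub_tables[column]
        return values[bisect_right(starts, key) - 1]


def stream_join(events, dim, key_field, projection):
    """Probe only the sub-tables named in the projection for each stream event."""
    for event in events:
        joined = dict(event)
        for column in projection:
            joined[column] = dim.lookup(event[key_field], column)
        yield joined


# Toy dimension table: 8 dense keys standing in for IP addresses.
rows = [(k, {"country": "CN" if k < 4 else "US",
             "isp": "A" if k < 2 else ("B" if k < 6 else "C")})
        for k in range(8)]
dim = ColumnSplitDimension(rows, ["country", "isp"])
events = [{"src": 5, "bytes": 100}, {"src": 1, "bytes": 40}]
result = list(stream_join(events, dim, "src", ["country"]))
```

In this toy table the `country` column shrinks from 8 rows to 2 runs and `isp` to 3 runs, and a query that projects only `country` never touches the `isp` sub-table. The more redundant a column is across neighbouring keys, the smaller its sub-table becomes.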
The continuing development of information technology makes the Internet increasingly significant in our lives, while the security incidents that have emerged along the way seriously threaten the application and development of the network. As a result, network security monitoring is essential to network maintenance, key infrastructure protection and information system security. One of the most challenging issues in network monitoring and data analysis is how to process and query real-time, massive monitoring data efficiently, so as to support various follow-up applications.
     Against the background of network security monitoring (NSM) and faced with the above challenges, this dissertation focuses on cost-efficient solutions to several basic problems in analyzing dynamic real-time data streams: continuous top-k queries, joins, multi-query optimization, and big time window queries. The main contributions are as follows:
     1. Continuous top-k queries on NSM data streams. An index over a data stream can improve query performance. However, continuous top-k queries usually use a grid index, which includes many free data points (points proved not to be top-k results). To prune those points, we propose an index structure, the k-max calculating region (k-MCR), based on the reverse dominant point set (RDPS) and the grid index. We obtain k-MCR by calculating the RDPS within the grid index and pruning grid units approximately and quickly, and we give update algorithms for adding and deleting data points in the data stream. Analytical and experimental evidence shows that the k-MCR index performs better in both index storage and query efficiency.
     2. Join algorithm for huge dimension tables on NSM data streams. We propose the MDI-NL join algorithm to optimize joins between huge dimension tables (such as IP address tables, containing 2^32 tuples) and NSM data streams, reducing CPU and memory consumption. Ordinarily, a huge dimension table has to be partitioned by row into small sub-tables, each loaded into memory in turn, resulting in frequent disk I/O. Compressing huge dimension tables into small in-memory tables improves efficiency. We observe that the join key of a dimension table is large and highly redundant, which can be exploited to compress dimension tables losslessly. We divide each dimension table by column into n sub-dimension tables, then compress the join key of each sub-dimension table and build an index for it. During a join, MDI-NL dynamically selects the indices of sub-dimension tables according to the projection of the query, which ensures that each chosen sub-dimension table has the least join key redundancy. Theoretical analysis and experimental evidence show that MDI-NL is well suited to joins with huge dimension tables whose join keys are highly redundant. Since dimension tables are compressed losslessly into relatively small sub-dimension tables, the scanning cost in each nested loop and the storage space both decrease, and join performance therefore improves.
     3. Multi-query optimization on NSM data streams. Two kinds of methods can be adopted for multi-query optimization on data streams: optimizing the query plan, and materializing intermediate results. Existing approaches to materializing intermediate results commonly use materialized views or indices and do not compress the intermediate results. To solve this problem, we propose the compressed stream cube as the storage structure for materialized intermediate results, based on compressed cubes (QC-tree and Dwarf). To further reduce space, only part of the nodes are materialized in the compressed model, unlike the traditional method in which all nodes are materialized. In our model, the compressed nodes are divided into basic nodes and additive nodes: the former are those required to keep query answers valid, and the latter are used to speed up the response to specific queries. This approach adopts an effective pruning technique to minimize the number of elements held in the limited memory space. With this method, many nodes not used by any query are pruned; some useful materialized nodes are inevitably pruned as well, which lowers query efficiency somewhat. Theoretical analysis and experimental evidence (with StreamQCtree as an example) indicate that our approach can discard a large number of nodes when few queries are processed per unit of time, effectively reducing stream cube storage at the cost of a small loss in query efficiency.
     4. Big time window queries on NSM data streams. With limited CPU power and storage, a DSMS (Data Stream Management System) can only handle queries within the current time window W, while big time window queries that extend beyond W cannot be handled. To address this problem, we propose an incremental hybrid storage architecture that combines a real-time DSMS with a mass-storage DBMS (DataBase Management System). To improve the efficiency of the DSMS, the time dimension is divided, in divide-and-conquer fashion, into small non-overlapping time window blocks; we process data only within each small block and combine the per-block results into the answer for a big time window query. Furthermore, materialized views in the DBMS store the query results beyond window W and are updated by incrementally loading the results of expired small blocks from the DSMS. This avoids inefficient update computation and improves the efficiency of updating and maintaining the materialized views. It should be noted that the system can only handle aggregation operations in OLAP.
     This dissertation addresses problems that urgently need solving in cost-efficient analysis and querying of NSM data streams. Substantial work has been done on continuous top-k queries, joins, multi-query optimization and big time window queries, bringing a breakthrough to the field. It advances large-scale network monitoring data processing in both theoretical study and practical application.
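The block-based scheme in contribution 4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the dissertation's system: a plain dictionary stands in for the DBMS materialized view, events are assumed to arrive in timestamp order, the aggregates are COUNT and SUM (both incrementally computable), and the class name `HybridWindowAggregator` and its parameters are hypothetical.

```python
class HybridWindowAggregator:
    """Per-block COUNT/SUM with expired blocks folded into a history view."""

    def __init__(self, block_size, blocks_in_window):
        self.block_size = block_size              # time units per small block
        self.blocks_in_window = blocks_in_window  # blocks kept by the "DSMS"
        self.blocks = {}  # block id -> {group: [count, total]}
        self.view = {}    # stand-in for the DBMS materialized view

    def insert(self, timestamp, group, value):
        """Update only the small block that the event falls into."""
        block_id = timestamp // self.block_size
        agg = self.blocks.setdefault(block_id, {}).setdefault(group, [0, 0])
        agg[0] += 1
        agg[1] += value
        self._expire(block_id)

    def _expire(self, newest_block):
        """Move blocks that left window W into the view, incrementally."""
        oldest_kept = newest_block - self.blocks_in_window + 1
        for block_id in [b for b in self.blocks if b < oldest_kept]:
            for group, (count, total) in self.blocks.pop(block_id).items():
                hist = self.view.setdefault(group, [0, 0])
                hist[0] += count
                hist[1] += total

    def query(self, group, include_history=False):
        """Combine per-block partial results; optionally add the stored history."""
        count = total = 0
        for block in self.blocks.values():
            c, t = block.get(group, (0, 0))
            count += c
            total += t
        if include_history:
            c, t = self.view.get(group, (0, 0))
            count += c
            total += t
        return count, total


agg = HybridWindowAggregator(block_size=10, blocks_in_window=2)
for ts, group, value in [(1, "a", 5), (12, "a", 7), (25, "a", 1)]:
    agg.insert(ts, group, value)
recent = agg.query("a")                        # blocks still inside window W
full = agg.query("a", include_history=True)    # window W plus expired history
```

Each event touches exactly one block, and the view is updated by adding a finished block's partial result rather than by recomputing from raw data. That only works for aggregates that can be merged this way, which matches the restriction to incrementally computable aggregate queries stated above.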
