Research on Parallel Computing Universal Programming Model and System Architecture (并行计算普适编程模型及系统架构研究)

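English title: Research on Parallel Computing Universal Programming Model and System Architecture
Author: Jin Jing (金晶)
Degree level: Doctoral
Discipline: Computer Science and Technology
Chinese keywords: 并行计算 ; 云计算 ; 海量数据处理 ; 编程模型 ; 系统架构
English keywords: Parallel Computing ; Cloud Computing ; Massive Data Processing ; Programming Model ; System Architecture
Degree year: 2012
Advisor: Chen Shanzhi (陈山枝)
Discipline code: 0812
Degree-granting institution: Beijing University of Posts and Telecommunications (北京邮电大学)
Submission date: 2012-06-27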

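Abstract

Information and data are enormously valuable to every industry, yet analyzing and processing massive data sets in a timely manner remains difficult. Over the past decade, as industries have become increasingly computerized, data volumes have grown rapidly in many of them. To meet the demand for timely analysis and processing of large-scale data, more and more fields have begun to adopt parallel computing. Over the past five to eight years, research on and application of parallel computing programming models have spread from specialized fields to highly computerized industries such as IT and e-commerce.
     Parallel computing is not a new technology. Decades have passed since the concept was proposed, and many specialized fields have a long research history and a substantial body of results. However, as the application domains have changed, the usage scenarios and requirements have changed greatly as well, and research on general-purpose parallel computing programming models and systems built on cluster resources is still lacking. As more industries adopt parallel computing, demand for broadly applicable parallel computing technology will keep growing, which presents both opportunities and challenges for research on general-purpose programming models and systems. In recent years, research on general-purpose parallel computing has begun to take shape, and many general-purpose programming models and systems, such as MapReduce and Dryad, have been proposed, but many problems remain unsolved.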
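     (1) Generality of models and systems. Most models and systems were proposed to meet the needs of a single class of problem, so the range of problem types they cover is limited, and a problem usually has to be transformed before it can be processed. In addition, the task-processing flow is fixed in the system design, which leaves model-based program design inflexible.
     (2) Scalability of systems. General-purpose parallel systems are usually built on large-scale clusters, but their designs give too little consideration to resource expansion. As clusters keep growing and task volumes keep increasing, the control core of the system has begun to struggle under its load.
     (3) Positioning of the general-purpose architecture layer. Using an architecture to manage resources and models to carry tasks can make a cluster more general, but it brings no benefit to any specific computing model. Unless the task-management flow can be abstracted, the generality of the architecture is confined to the resource-allocation layer.
     (4) Exploring application domains for the model. The strong performance of general-purpose parallel computing on massive data processing has steered the approach to many problems toward parallel processing. Not every problem is suited to parallel processing, however, so the range of application of the model deserves careful thought.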
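     To address these problems, this thesis carries out the following work:
     (1) General-purpose parallel computing programming models and systems are studied, and the strengths and weaknesses of existing models are analyzed and summarized. A parallel-mode design is proposed that layers function parallelism, time parallelism, and data parallelism on top of one another, which widens the range of application of the model. In the system design, control of an application's execution flow is treated as a special task, which allows more varied execution flows and more flexible application design. Together, these two design innovations strengthen the generality of the programming model and system.
     (2) The scalability of general-purpose parallel systems is studied, and the main causes of scalability problems in existing systems are analyzed and summarized. A distributed multi-control-point system architecture is proposed in which distributed management by multiple control points replaces centralized management by a single control point. The mechanism for sending and processing resource signaling is optimized, and the management and scheduling of concurrent applications are split across different control points. This addresses the scalability problems caused by limited control-point resources and ever-growing task load, and so improves the system's ability to scale.
     (3) General-purpose system architecture is studied, and a sustainably scalable cluster architecture is proposed. While solving the system scalability problem, the architecture is also made better suited to hosting general-purpose parallel computing models. The new architecture separates resource management from task scheduling and gives the management modules a hierarchical design, so that the control layer can scale as well. It also abstracts task management: common task-management functions are built into the architecture, while flow definition and control are left to the task-management nodes.
     (4) The application of general-purpose parallel computing to network state analysis is studied. Based on the characteristics of general-purpose parallel computing systems, the problem types to which they are suited are analyzed. With traffic congestion adjustment as the main subject, a parallel algorithm is designed and a parallel system is used to accelerate processing, which shortens the time needed to solve the problem. This study attempts to find a way to analyze network state in parallel.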
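The layering of function, time, and data parallelism described in item (1) can be illustrated with a small sketch. This is not the thesis's implementation: it assumes that "time parallelism" means pipelining batches through successive stages, and every name in it (`data_parallel`, `time_parallel`, `function_parallel`, `run_demo`) is hypothetical.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

_DONE = object()  # sentinel that marks the end of a batch stream


def data_parallel(fn, batch, pool):
    """Data layer: apply fn to every item of one batch concurrently."""
    return list(pool.map(fn, batch))


def _stage_worker(fn, inbox, outbox, pool):
    """Run one pipeline stage: read batches from inbox, write results to outbox."""
    while True:
        batch = inbox.get()
        if batch is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(data_parallel(fn, batch, pool))


def time_parallel(stages, batches, pool):
    """Time layer (assumed to mean pipelining): each stage runs in its own
    thread, so different batches can occupy different stages at once."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=_stage_worker,
                         args=(fn, queues[i], queues[i + 1], pool))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for batch in batches:
        queues[0].put(batch)
    queues[0].put(_DONE)
    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


def function_parallel(pipelines, batches, pool):
    """Function layer: run independent pipelines over the same input concurrently."""
    with ThreadPoolExecutor(max_workers=len(pipelines)) as outer:
        futures = {name: outer.submit(time_parallel, stages, batches, pool)
                   for name, stages in pipelines.items()}
        return {name: f.result() for name, f in futures.items()}


def run_demo():
    batches = [[1, 2, 3], [4, 5, 6]]
    pipelines = {
        "square_then_inc": [lambda x: x * x, lambda x: x + 1],
        "double": [lambda x: 2 * x],
    }
    with ThreadPoolExecutor(max_workers=4) as pool:
        return function_parallel(pipelines, batches, pool)
```

The leaf work runs in a shared worker pool while the stage and pipeline threads only coordinate, so the outer layers never block on a pool they are also occupying.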
Data and information are enormously valuable to all industries. However, it is quite difficult to process massive data and extract useful information in time. In the past decade, the amount of data has grown rapidly as many industries have become more information-driven. To meet the demands of massive data processing and analysis, parallel computing has been introduced into more and more fields. In the past five to eight years, the research and application of parallel computing programming models have extended from specialized fields to many information industries, such as IT and e-commerce.
     Processing data in parallel is not a novel technology. The concept of parallelization was put forward decades ago, and many specialized fields have a long history of research on parallel computing with many achievements. However, as the application environment has changed, the scenarios and demands have changed greatly. So far, there has been little research on general parallel computing programming models and system architectures based on clusters.
     As parallel computing techniques spread to more and more industries, the demand for general parallel computing programming models is increasing. This is an opportunity as well as a challenge for researchers. In recent years, research on general parallel computing has taken shape, and many general programming models and related implementations have been proposed, such as MapReduce and Dryad. However, many issues still need to be investigated:
     (1) The generality of programming models and related implementations: Most programming models and related implementations are designed to meet a certain kind of demand, and only a few kinds of data analysis jobs can be processed directly and efficiently. Meanwhile, because the processing modes are built into the implementations, program design lacks flexibility.
     (2) The scalability of the system: General parallel computing systems are usually built on large-scale clusters. However, the system architecture lacks scalability. As clusters expand rapidly and task sizes increase constantly, it has become hard for the control unit to manage the system.
     (3) The design demands of a common framework: Although separating application management from resource management may enhance the generality of clusters, it has little benefit for a specific programming model. If the general process of application management cannot be abstracted, the use of a common framework will be limited to resource management.
     (4) The applicability of the programming model: Because general parallel computing has performed very well in massive data processing, attempts are being made to do more jobs in parallel. However, parallel processing is not suitable for all jobs, so the applicability of the programming models should be taken into account.
     This thesis focuses on the above issues and carries out the following research:
     (1) Research on the general parallel computing programming model and system. We analyzed the existing models and summarized their advantages and shortcomings. We then proposed a new programming model that integrates three parallel modes on different logical layers: function parallelism, time parallelism, and data parallelism. We also redesigned the architecture of the related system. Because the process control of an application is a special task in the new design, program design is more flexible. This design enhances the generality of the programming model and related system.
     (2) Research on the scalability of general parallel systems. By analyzing the causes of scalability bottlenecks, we proposed a new system architecture with distributed multiple masters. In the new system, distributed management replaces centralized management, the resource signaling and related processing mechanisms are optimized, and the management of concurrent applications is assigned to different masters. All these new designs may enhance the scalability of the system.
     (3) Research on the common framework. We proposed a common, sustainably scalable framework for general parallel computing systems. The new framework takes into account not only the scalability of the system but also the demands of the general parallel computing programming model. It separates application management from resource management and uses a hierarchical structure to construct scalable management modules. Meanwhile, general functions of application management are integrated into the framework, and the definition of the specific processing flow can be supplied by different programming models.
     (4) Research on applying general parallel computing to the parallel processing of network state data. We analyzed general parallel computing systems and summarized the characteristics of problems that can be processed in parallel. We also designed a parallel algorithm to solve the problem of traffic congestion; running it on a parallel system may accelerate the process. The goal of this research is to find a way to analyze network states in parallel.
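The idea in item (2) of assigning the management of concurrent applications to different masters can be sketched as follows. The abstract does not say how applications are mapped to masters, so the least-loaded policy below is an assumption chosen for illustration, and the `Master` and `MasterGroup` names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """One control point and the applications it currently manages."""
    name: str
    apps: list = field(default_factory=list)


class MasterGroup:
    """Spreads application management over several masters.

    Assumed policy: a new application goes to the master that manages
    the fewest applications; ties go to the earliest-registered master.
    """

    def __init__(self, names):
        if not names:
            raise ValueError("at least one master is required")
        self.masters = [Master(n) for n in names]
        self.owner = {}  # application id -> name of the managing master

    def submit(self, app_id):
        """Assign an application to a master and return the master's name."""
        if app_id in self.owner:
            return self.owner[app_id]
        target = min(self.masters, key=lambda m: len(m.apps))
        target.apps.append(app_id)
        self.owner[app_id] = target.name
        return target.name

    def finish(self, app_id):
        """Release a completed application from its master."""
        name = self.owner.pop(app_id)
        master = next(m for m in self.masters if m.name == name)
        master.apps.remove(app_id)

    def add_master(self, name):
        """Scale the control layer out by registering another master."""
        self.masters.append(Master(name))
```

For example, with masters `m0` and `m1`, submitting applications `a`, `b`, and `c` places them on `m0`, `m1`, and `m0`; after `add_master("m2")`, the next submission goes to `m2` because it manages nothing yet.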

