Research on Key Technologies for Cost-Effective Fault-Tolerant Networks-on-Chip
Abstract
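In the future, a single chip will integrate hundreds or even thousands of processors, and the communication volume between them will be enormous. Traditional on-chip interconnect structures scale too poorly to meet the communication demands of multi-core chips. In addition, as CMOS feature sizes shrink, gate delay falls markedly while wire delay falls far less, so wire delay comes to exceed gate delay. Global interconnect wires must therefore be designed with great care or avoided altogether. A network-on-chip (NoC) replaces global interconnect wires with local links, scales well, and can meet the communication needs of on-chip multi-core processors. More than 10% of transistors may suffer hard faults because of process variation and other causes. As CMOS feature sizes shrink, wires become narrower and the probability of hard faults rises. Hard faults can paralyze a network-on-chip. Existing NoCs tolerate only a small number of hard faults, and their coarse fault-tolerance granularity leads to large area overhead. Designing a cost-effective fault-tolerant NoC has therefore become one of the major problems in NoC research. To improve the fault tolerance of NoCs, this thesis starts from the implementation of a hardware-level simulation and development platform and studies in depth the router architecture, fault-tolerant routing algorithms, a fine-grained fault-tolerant NoC architecture, and fault-tolerant task mapping algorithms. The main results are as follows: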
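     1. A hardware-level simulation and development platform for networks-on-chip, HardSim, is designed. It combines simulation and design verification, and performs hardware-level simulation, which captures more hardware detail than flit-level simulation. It supports both synthetic workloads and trace-driven simulation of real applications. It implements two fault injection modes, static and dynamic. Static faults are generated and loaded into the network before the simulation starts. Dynamic faults are generated and injected into the network while the simulation is running.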
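     2. A low-latency shared output-buffered router, SOBR, is proposed. It has five key features: (1) the virtual channels (VCs) are located at the output ports rather than the input ports; (2) a virtual-channel swapper dynamically configures the access matrix, which increases the usable buffer capacity and improves router performance; (3) a dynamic FIFO buffer structure supports leap-read operations, which reduces packet blocking; (4) dynamic layered switching is used to improve router performance; (5) the minimum latency through the router is one clock cycle for every type of flit. Synthesis results with a 65 nm standard cell library show that the critical path of the SOBR router is only 24 logic gates, with a delay of about 0.64 ns. Because its pipeline is only one cycle deep, the average latency of SOBR is significantly lower than that of the other routers. On a 4×4 mesh under uniform random traffic, the saturation throughput of SOBR reaches 0.86 flits/node/cycle. Because modules such as the VC allocator and the crossbar switch allocator are omitted, the area overhead of SOBR is 9.4% lower than that of a classic input virtual-channel router with the same buffer capacity. In addition, qualitative analysis indicates that the VC swapper effectively improves the fault tolerance of SOBR.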
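     3. An efficient distributed fault-tolerant routing algorithm, PR-WF, is proposed and applied to the SOBR router. PR-WF is based on the west-first turn model and uses a dynamic pseudo-receiving (DPR) mechanism to enable or disable west turns dynamically while keeping the network free of deadlock. Under the DPR mechanism, the local network interface receives a packet that needs to turn west and forwards it to the west port. The local network interface needs FIFO buffers to store these west-turn packets. PR-WF applies a specific priority rule to generate several priority queues for each port, and selects the output port according to the state of the network links and of the neighboring routers' ports. PR-WF is a logic-based distributed fault-tolerant routing algorithm, and its area overhead is far lower than that of table-based fault-tolerant routing algorithms. PR-WF is independent of network size and therefore scales better. At a link fault rate of 10%, PR-WF needs to disable only 1.8% of the healthy links to avoid livelock. For a 9×9 mesh with a 10% link fault rate, the average hop count under PR-WF is only 8.34% higher than that of the shortest path. In summary, PR-WF is an efficient distributed fault-tolerant routing algorithm.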
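     4. A fine-grained fault-tolerant network-on-chip architecture, SNoC, is proposed and implemented. Building on channel slicing, SNoC couples the slices to one another through slice interface units, so that the network can tolerate hard faults at fine granularity. Each router contains 4 slices, and each link contains 4 sub-links plus 1 spare sub-link. SNoC uses an adaptive slice interface unit that provides an optimized configuration for the slice interface according to the state of the slices and sub-links. Every slice uses the SOBR architecture and the PR-WF fault-tolerant routing algorithm. Simulation and analysis results show that the SNoC architecture greatly reduces the number of effective faults. Even at high fault rates, SNoC still achieves good performance. Synthesis results at 65 nm show that the area overhead of SNoC is about 1% higher than that of a channel-sliced network-on-chip.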
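     5. A low-overhead fault-tolerant task mapping algorithm, CMAP, is proposed. Most existing task mapping algorithms are search-based, which makes them slow and poorly scalable. A constructive algorithm instead builds a near-optimal solution from scratch, guided by the characteristics of optimal solutions. Its advantages are low time complexity and fast execution. CMAP is a constructive algorithm for task mapping. It is topology-aware: by building linked lists, it maps edges with large weights onto single-hop routing paths wherever possible, or gives priority to mapping nodes with large degree onto locally optimal positions, and in this way it solves the task mapping problem for irregular topologies. The algorithm was evaluated with two real applications and a variety of task graphs, which demonstrated that CMAP has good accuracy, efficiency, scalability, and fault tolerance.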
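To make the west-first turn model mentioned in item 3 concrete, the following is a minimal Python sketch of the plain turn-model rule on a 2D mesh, with a simple filter for faulty links. It is not PR-WF itself: the DPR mechanism and the priority queues are not modeled, and the names `west_first_ports`, `usable_ports`, `STEP`, and the coordinate convention (x grows eastward, y grows northward) are assumptions made for illustration only.

```python
from typing import List, Set, Tuple

Node = Tuple[int, int]      # (x, y) position of a router in the mesh
Link = Tuple[Node, Node]    # directed link from a router to a neighbor

# Unit step for each output port. Assumed convention: x grows to the
# east and y grows to the north.
STEP = {"W": (-1, 0), "E": (1, 0), "N": (0, 1), "S": (0, -1)}


def west_first_ports(cur: Node, dst: Node) -> List[str]:
    """Minimal output ports permitted by the plain west-first turn model."""
    (cx, cy), (dx, dy) = cur, dst
    if dx < cx:
        # All westward hops must be taken first, so W is the only choice.
        return ["W"]
    ports = []
    if dx > cx:
        ports.append("E")
    if dy > cy:
        ports.append("N")
    if dy < cy:
        ports.append("S")
    return ports


def usable_ports(cur: Node, dst: Node, faulty: Set[Link]) -> List[str]:
    """Keep only the permitted ports whose outgoing link is not faulty."""
    result = []
    for port in west_first_ports(cur, dst):
        sx, sy = STEP[port]
        nxt = (cur[0] + sx, cur[1] + sy)
        if (cur, nxt) not in faulty:
            result.append(port)
    return result


if __name__ == "__main__":
    # Destination lies to the north-east: two adaptive choices, one faulty.
    print(usable_ports((1, 1), (3, 3), {((1, 1), (2, 1))}))
    # Destination lies due west and the west link is faulty: no port is left.
    print(usable_ports((3, 3), (1, 3), {((3, 3), (2, 3))}))
```

The second call returns an empty list: under the plain turn model a packet that must travel west has no alternative when its west link is faulty. This is the kind of situation in which a scheme needs some way to re-enable west turns safely, which is the role the abstract assigns to DPR.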
In the future, the number of processors on a single chip will reach hundreds or even thousands, and the communication bandwidth required between processors will be very large. Traditional on-chip interconnection methods cannot meet the communication requirements of multi-core chips because of their poor scalability. As the feature size of CMOS technology shrinks, gate delay is significantly reduced while wire delay decreases far less, so wire delay becomes larger than gate delay. Therefore, global interconnects have to be designed carefully or avoided. Networks-on-Chip (NoCs) use local links instead of global interconnection wires, and scale well enough to meet the communication requirements of multi-core chips.
     More than 10% of transistors may be faulty because of process variation or other reasons. As the CMOS feature size shrinks, wires become narrower and the probability of hard faults also increases. Hard faults may paralyze a NoC. Existing NoCs can tolerate only a small number of faults, and their coarse fault-tolerance granularity results in large area overhead. The design of a low-overhead fault-tolerant NoC has therefore become a major challenge. To improve the ability of NoCs to tolerate hard faults, this thesis starts from the implementation of a hardware-level simulation and design platform, studies in depth the router architecture, fault-tolerant routing algorithm, fine-grained network architecture, and task mapping algorithm, and achieves the following results:
     1. A hardware-level on-chip network simulation and design platform, HardSim, is designed. It performs hardware-level simulation, which describes more hardware detail than flit-level simulation. It combines simulation and design verification in a unified flow. It supports trace-based simulation of real applications. It implements two kinds of fault injection, static and dynamic. Static faults are generated and loaded into the network before the simulation starts. Dynamic faults are generated and loaded into the network during the simulation.
     2. A low-latency shared output-buffered router, named SOBR, is presented. SOBR has 5 important features: (1) its virtual channels are located in the output ports rather than the input ports; (2) the virtual-channel swapper dynamically configures the access matrices, which increases the available buffer capacity and improves network performance; (3) its dynamic FIFO buffer architecture supports leap-read operations to reduce packet blocking; (4) dynamic layered switching is used to improve network performance; (5) in the ideal case, every type of flit can pass through a router in one clock cycle. Synthesis results at 65 nm show that its critical path is only 24 logic gates and its worst-case delay is about 0.64 ns. Owing to its single-cycle pipeline, the average latency of SOBR is clearly lower than that of other routers. For a 4×4 mesh under the uniform random traffic pattern, the saturation throughput of SOBR reaches 0.86 flits per node per cycle. Owing to the elimination of the VC allocation, switch allocation, and switch modules, the area overhead of SOBR is up to 9.4% lower than that of an input virtual-channel router with the same buffer capacity. Qualitative analysis shows that the virtual-channel swapper effectively enhances the fault tolerance of SOBR.
     3. A low-overhead distributed fault-tolerant routing algorithm, PR-WF, is presented and integrated into the SOBR router. It is based on the west-first turn model and uses a dynamic pseudo-receiving (DPR) mechanism to enable or disable west turns while keeping the network deadlock-free. Under the DPR mechanism, the local network interface temporarily receives west-turn packets and then forwards them to the west port as soon as possible. To store west-turn packets, each local network interface has a FIFO buffer queue for each west turn. The DPR mechanism can turn off west turns to avoid deadlock. PR-WF chooses a suitable output port according to the state of the output links and neighboring routers, following a specific priority rule. It is a logic-based distributed fault-tolerant routing algorithm, and its area overhead is much lower than that of table-based fault-tolerant routing algorithms. It is independent of the network size, so it scales well. To avoid livelock, it needs to disable only 1.8% of the good links at a 10% link fault rate. For a 9×9 mesh with 10% faulty links, its average hop count is only 8.34% higher than that of the shortest path. In summary, PR-WF is an efficient, low-overhead distributed fault-tolerant routing algorithm.
     4. A low-overhead fault-tolerant NoC architecture, SNoC, is presented. SNoC is based on channel slicing and couples all slices through slice interfaces, which allows the network to tolerate faults at fine granularity. Each router has 4 slices, and each link has 5 sub-links. Its slice interface is self-reconfigurable and provides an optimal configuration according to the states of the slices and links. Each slice uses the SOBR architecture and the PR-WF fault-tolerant routing algorithm. Simulation results show that SNoC achieves good performance even at high fault rates. Synthesis results at 65 nm show that its router critical path is only 0.08 ns longer than that of the SOBR router, and that its area overhead is only about 1% higher than that of a channel-sliced network.
     5. A low-overhead fault-tolerant task mapping algorithm, CMAP, is proposed. Existing task mapping algorithms are based on search methods, which are time-consuming and scale poorly. Construction algorithms instead build sub-optimal solutions step by step according to the characteristics of optimal solutions. They have low complexity and run faster than search algorithms. Therefore, a construction algorithm, named CMAP, is presented for the task mapping problem. It is topology-aware, which lets it solve the task mapping problem for irregular topologies. It maps heavily weighted edges of the task graph onto single-hop routes as far as possible, or maps high-degree nodes of the task graph to locally optimal positions. Two real applications and a variety of task graphs are used to verify its accuracy, efficiency, scalability, and fault tolerance.
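The constructive idea in item 5, placing heavily weighted task-graph edges on single-hop routes and skipping unusable tiles, can be illustrated with a small greedy sketch in Python. This is a generic heaviest-edge-first heuristic written for illustration, not the CMAP algorithm described in the thesis. The names `greedy_map`, `hops`, and `comm_cost` are hypothetical, Manhattan distance stands in for the real routing distance, and the sketch assumes there are at least as many healthy tiles as tasks and that every task appears in at least one edge.

```python
from typing import Dict, Optional, Set, Tuple

Tile = Tuple[int, int]         # (x, y) position of a healthy tile
Edge = Tuple[str, str]         # pair of communicating tasks


def hops(a: Tile, b: Tile) -> int:
    """Manhattan distance, used here as a stand-in for routing distance."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def greedy_map(edges: Dict[Edge, int], tiles: Set[Tile]) -> Dict[str, Tile]:
    """Place tasks edge by edge, heaviest edge first, on healthy tiles only."""
    placement: Dict[str, Tile] = {}
    free = set(tiles)  # faulty tiles are simply absent from `tiles`

    def place(task: str, near: Optional[Tile]) -> None:
        # Pick the free tile closest to `near`; ties break on coordinates.
        if near is None:
            tile = min(free)
        else:
            tile = min(free, key=lambda t: (hops(t, near), t))
        placement[task] = tile
        free.remove(tile)

    ordered = sorted(edges.items(), key=lambda kv: (-kv[1], kv[0]))
    for (a, b), _weight in ordered:
        if a not in placement and b not in placement:
            place(a, None)
            place(b, placement[a])
        elif a not in placement:
            place(a, placement[b])
        elif b not in placement:
            place(b, placement[a])
    return placement


def comm_cost(edges: Dict[Edge, int], placement: Dict[str, Tile]) -> int:
    """Sum of edge weight times hop count over all task-graph edges."""
    return sum(w * hops(placement[a], placement[b])
               for (a, b), w in edges.items())


if __name__ == "__main__":
    # A 2x2 mesh in which tile (1, 1) is faulty and therefore left out.
    healthy = {(0, 0), (1, 0), (0, 1)}
    graph = {("a", "b"): 10, ("b", "c"): 5, ("a", "c"): 1}
    mapping = greedy_map(graph, healthy)
    print(mapping, comm_cost(graph, mapping))
```

Because the mapping is built in a single pass over the sorted edges, the cost is dominated by the sort and the nearest-free-tile lookups. That is the general reason a constructive method runs faster than a search-based one, at the price of having no guarantee of optimality.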
    [137]Hu J, Marculescu R. Energy-aware mapping for tile-based NoC architectures under performance constraints [C]. In Proceedings of the2003Asia and South Pacific Design Automation Conference. New York, NY, USA,2003:233-239.
    [138]Sandvei Jensen B. Optimizing Application Mapping for Network-On-Chip Systems [D]. Denmark:Technical University of Denmark,2008.
    [139]Lei T, Kumar S. A Two-step Genetic Algorithm for Mapping Task Graphs to a Network on Chip Architecture [C]. In Proceedings of the Euromicro Symposium on Digital Systems Design. Washington, DC, USA,2003:180-187.
    [140]Moein-darbari F, Khademzade A, Gharooni-fard G. CGMAP:a new approach to Network-on-Chip mapping problem [J]. IEICE Electronics Express.2009,6(1):
    [141]Shin D, Kim J. Power-aware communication optimization for networks-on-chips with voltage scalable links [C]. In Proceedings of the2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis. New York, NY, USA,2004:170-175.
    [142]Tommi S, Juha-Pekka S. Evaluating application mapping using network simulation [C]. In Proceedings of International Symposium on System-on-Chip. Tampere, Finland, November2003:1-4.
    [143]孙榕,林正浩.基于遗传算法的NoC处理单元映射研究[J].计算机科学.2008,35(4):51-54.
    [144]周干民,尹勇生,胡永华等.基于蚁群优化算法的NoC映射[J].计算机工程与应用.2005,18:7-11.
    [145]Yancang C, Lunguo X, Jinwen L, et al. An Energy-Aware Heuristic Constructive Mapping Algorithm for Network on Chip [C]. In Proceedings of International Conference ASIC. Changsha, China,2009:101-104.
    [146]Murali S, De Micheli G. Bandwidth-Constrained Mapping of Cores onto NoC Architectures[C]. In Proceedings of the conference on Design, automation and test in Europe-Volume2. Washington, DC, USA,2004:896-901.
    [147]Wenbiao Z, Yan Z, Zhigang M. An application specific NoC mapping for optimized delay [C]. In Proceedings of International Conference onDesign and Test of Integrated Systems in Nanoscale Technology. Tunis, October2006:184-188.
    [148]Marcon C, Calazans N, Moraes F, et al. Exploring NoC Mapping Strategies:An Energy and Timing Aware Technique [C]. In Proceedings of the conference on Design, Automation and Test in Europe-Volume1. Washington, DC, USA,2005:502-507.
    [149]Kutami H, Fukushima Y, Fukushi M, et al. Route-Aware Task Mapping Method for Fault-Tolerant2D-Mesh Network-on-Chips.[C]. In Proceedings of IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT). Vancouver, BC,2011:472-480.
    [150]Chou C-L, Ogras U Y, Marculescu R. Energy-and Performance-Aware Incremental Mapping for Networks on Chip With Multiple Voltage Levels [J]. IEEE Transaction on CAD of Integrated Circuits and Systems.2008,27(10):1866-1879.
    [151]Dick R, Rhodes D, Wolf W. TGFF Task Graphs for Free [C]. In Proceedings of International Workshop on Hardware/Software Co-Design. Los Alamitos, CA, USA,1998:97-108.
    [152]Hansson A, Goossens K, Andrei R. A Unified Approach to Mapping and Routing on a Network-on-Chip for Both Best-Effort and Guaranteed Service Traffic [J]. Journal of VLSI Design.2007:1-16.
    [153]Giuseppe A, Vincenzo C, Maurizio P. Mapping Cores on Network-on-Chip [J]. International Journal of Computational Intelligence Research.2005,1(18):109-126.
