Semi-Markov Switching State-Space Control Processes and Their Applications
Abstract
With the rapid development and wide application of information science and technology, numerous network communication systems have emerged that reflect the direction of modern technological development. Driven by growing application demands, these systems are becoming ever more powerful in function and complex in structure, and the frequent interaction of multiple control strategies with a stochastically varying environment makes their dynamic behavior more complex still. Performance analysis and optimization provide the basis for system design and the control decisions for system operation, and play a key role in improving operational efficiency and service capacity while guaranteeing quality of service (QoS). Many approaches to the performance analysis and optimization of stochastic dynamic systems have appeared in fields such as systems and control (perturbation analysis), operations research (Markov decision processes), and computer science and artificial intelligence (reinforcement learning). The main challenge facing this research area is the gap between the complexity of real network communication systems and the limitations of existing optimization methods. How to characterize and exploit system features effectively, develop novel optimization approaches, and thereby solve the key technical problems arising in real network communication systems is an important theoretical and practical topic.
     Motivated by the optimization of modern network communication systems, this thesis introduces an analysis and optimization framework called semi-Markov switching state-space control processes (SMSSCPs), an event-driven formalism with a hierarchical dynamic structure; its modeling, performance analysis, and event-based optimization methods are developed systematically. Through a flexible definition and classification of events, and according to the dynamic features of the system, the state space is divided into multiple layers, which makes the modeling flexible, general, and scalable and improves its ability to describe real systems. Adopting event-driven control policies reduces the policy space, cuts the computational complexity of the associated optimization algorithms substantially, and improves their real-time performance. The structure of event-driven policies and the hierarchical dynamic-structure information are further exploited to reduce the dependence of the optimization algorithms on knowledge of system parameters and thus improve their adaptability. The approach is applied to a class of key technical problems in network communication systems, including adaptive bandwidth allocation in wireless multimedia communication networks, policy optimization for dynamic power management, and the modeling and optimization of networked media service systems, providing low-cost, high-performance, scalable, and manageable control solutions for modern information services.
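     To make the policy-space reduction concrete, the following minimal Python sketch contrasts state-based with event-driven deterministic policies. The state space, event types, and actions are hypothetical and serve only to illustrate the counting argument: shrinking the decision domain from states to event types shrinks the policy space exponentially.

```python
# Toy illustration (hypothetical state space, event types, and actions) of
# why event-driven policies shrink the policy space: decisions are attached
# to observable events rather than to every state.

states = [(layer, sub) for layer in range(3) for sub in range(10)]  # 30 states
events = ["arrival", "departure", "layer_switch"]                   # 3 event types
actions = ["accept", "reject"]

# Deterministic state-based policies assign one action per state.
print("state-based policies :", len(actions) ** len(states))   # 2**30
# Deterministic event-driven policies assign one action per event type.
print("event-driven policies:", len(actions) ** len(events))   # 2**3 = 8

# An event-driven policy is a small lookup table consulted only at event
# epochs; between events the system evolves without control intervention.
policy = {"arrival": "accept", "departure": "accept", "layer_switch": "reject"}

def act_on(event):
    """Return the control action triggered by an observed event."""
    return policy[event]

print(act_on("arrival"))  # -> accept
```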
     The analytical model of SMSSCPs is formally introduced by characterizing events at different layers, constructing the semi-Markov kernel and the quasi-infinitesimal generator of the system under an event-driven policy, and defining the performance and switching-cost functions as well as the performance measure to be optimized. Based on the semi-Markov performance potentials, the Poisson equation for the switching processes is derived, and the performance sensitivity formulas, i.e., the performance-gradient and performance-difference formulas, are constructed. For the optimization of deterministic policies, a comparison theorem for event-driven switching policies is derived by exploiting the hierarchical dynamic-structure information contained in the semi-Markov kernel and the quasi-infinitesimal generator; on the basis of this theorem, policy iteration no longer depends on knowledge of the transition probabilities, making the algorithm adaptive. Exploiting the structure of event-driven policies also relaxes the applicability condition of policy iteration, namely the action-independence assumption, and thereby extends the range of problems policy iteration can handle to those formulated as SMSSCPs. The potentials are aggregated according to events, which reduces the number of potentials that must be computed or estimated, lowers the computational complexity, and improves real-time performance. On this basis, an online adaptive policy iteration algorithm driven by a single sample path is presented, and its convergence is proved. For the optimization of randomized policies, starting from the performance-gradient formula and a single-sample-path representation of the potentials, a single-sample-path estimate of the gradient of the average performance with respect to the event-driven switching policy is derived; combined with stochastic approximation, a policy-gradient-based online adaptive optimization algorithm is proposed. Exploiting the event-driven policy structure reduces the computation, removes the dependence on system parameters, and improves adaptability. Owing to the hierarchical dynamic structure of SMSSCPs, this algorithm is proved to converge to the global optimum with probability 1.
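     For orientation, the relations below are the standard potential-based sensitivity formulas for an ergodic semi-Markov decision process, written in generic notation that is not taken from the thesis: embedded transition matrix $P$, expected sojourn times $\tau$, expected one-visit rewards $f$, stationary distribution $\pi$ of the embedded chain, and average reward $\eta = \pi f / \pi\tau$. The thesis's switching-space counterparts additionally account for switching costs and event-driven policies.

```latex
% Poisson equation defining the potentials g:
(I - P)\,g = f - \eta\,\tau
% Performance-difference formula between policies (P',f',\tau') and (P,f,\tau):
\eta' - \eta = \frac{\pi'\big[(f' - f) + (P' - P)\,g - \eta\,(\tau' - \tau)\big]}{\pi'\,\tau'}
% Performance-derivative formula for a policy parameterized by \theta:
\frac{d\eta}{d\theta} = \frac{\pi\big[\frac{df}{d\theta} + \frac{dP}{d\theta}\,g - \eta\,\frac{d\tau}{d\theta}\big]}{\pi\,\tau}
```

The difference formula is what underlies comparison theorems and policy iteration, while the derivative formula underlies single-sample-path gradient estimation.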
     The problem of adaptive bandwidth allocation in wireless multimedia communication networks is considered first. An event-driven stochastic analytical model is introduced that formulates adaptive bandwidth allocation as a constrained stochastic optimization problem. In this model, adaptive bandwidth allocation and call admission control are treated as a unified whole; the different priorities of the traffic classes are taken into account, and handoff calls are distinguished from, and prioritized over, new calls. Three important QoS metrics serve as constraints: the new-call blocking probability, the handoff dropping probability, and the average allocated bandwidth. An online adaptive optimization algorithm combining policy-gradient estimation and stochastic approximation is proposed for the constrained problem. The algorithm fully exploits the event-driven policy structure, so computation and estimation are needed only at event epochs; it does not depend on prior knowledge of system parameters, adapts well to changing environments, and is guaranteed to converge to the global optimum. Simulation results demonstrate that it effectively maximizes network revenue while guaranteeing the QoS constraints.
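     One common way to couple policy-gradient estimates with stochastic approximation under QoS constraints is a primal-dual (Lagrangian) scheme; the Python sketch below illustrates only this generic pattern, with placeholder surrogate gradients, step sizes, and thresholds, not the thesis's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def estimate_on_sample_path(theta):
    """Placeholder estimators standing in for quantities that would, in
    practice, be estimated at event epochs of a single sample path: a noisy
    surrogate revenue gradient, a surrogate QoS value (handoff dropping
    probability), and its gradient. Illustration only."""
    grad_revenue = -(theta - 0.5) + 0.1 * rng.standard_normal(theta.shape)
    qos_value = 0.05 * theta.sum()          # surrogate dropping probability
    grad_qos = 0.05 * np.ones_like(theta)
    return grad_revenue, qos_value, grad_qos

theta = np.zeros(3)   # parameters of a randomized event-driven policy
lam = 0.0             # Lagrange multiplier for the QoS constraint
target = 0.02         # e.g. dropping probability must stay below 2%

for n in range(1, 20001):
    a_n, b_n = 1.0 / n, 2.0 / n                   # two-timescale step sizes
    g_rev, c_val, g_c = estimate_on_sample_path(theta)
    theta += a_n * (g_rev - lam * g_c)            # ascend revenue, penalize violation
    lam = max(0.0, lam + b_n * (c_val - target))  # dual ascent on the multiplier

print("theta:", theta.round(3), "lambda: %.2f" % lam)
```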
     Policy optimization for dynamic power management (DPM) is then discussed. For the optimization of stochastic and timeout policies, event-driven semi-Markov switching models are presented that capture the hierarchical dynamics of power-managed systems embedded in stochastic environments, including the transitions between operating states and the dwelling in the idle state until the timeout expires; this modeling accuracy underpins the reliability of the analysis and the effectiveness of the optimization. Two online adaptive optimization algorithms are proposed, one for stochastic policies and one for timeout policies. By exploiting the event-driven policy structure and the hierarchical dynamic-structure information, the algorithms meet the practical requirements of adaptability, real-time computation, and effectiveness. Finally, by analyzing the steady-state behavior of power-managed systems under stochastic and timeout policies, the equivalence of the two policy types with respect to the power-performance tradeoff is revealed, and the equivalence relation between them is derived.
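     The two policy families can be compared numerically: a timeout policy sleeps once an idle period exceeds a threshold tau, whereas a stochastic (randomized) policy sleeps at the start of an idle period with some probability p. The Monte Carlo sketch below, with hypothetical exponential idle times and power figures, chooses p = P(idle > tau) so that both families produce the same sleep frequency, giving a numerical flavor of the tradeoff correspondence that the thesis derives analytically.

```python
import numpy as np

rng = np.random.default_rng(1)
idle = rng.exponential(scale=2.0, size=100_000)  # hypothetical idle periods (s)
P_IDLE, P_SLEEP, E_WAKE = 1.0, 0.1, 0.5          # hypothetical power (W) / wake energy (J)

def timeout_policy(tau):
    """Mean energy per idle period and sleep frequency for timeout tau."""
    sleeps = idle > tau
    energy = np.where(sleeps,
                      P_IDLE * tau + P_SLEEP * (idle - tau) + E_WAKE,
                      P_IDLE * idle)
    return energy.mean(), sleeps.mean()

def stochastic_policy(p):
    """Sleep immediately, with probability p, at the start of each idle period."""
    sleeps = rng.random(idle.size) < p
    energy = np.where(sleeps, P_SLEEP * idle + E_WAKE, P_IDLE * idle)
    return energy.mean(), sleeps.mean()

for tau in (0.5, 1.0, 2.0):
    p = np.exp(-tau / 2.0)   # P(idle > tau): matches the sleep frequency
    print("tau=%.1f : timeout -> %.3f J, sleep rate %.3f ; "
          "stochastic(p=%.2f) -> %.3f J, sleep rate %.3f"
          % ((tau,) + timeout_policy(tau) + (p,) + stochastic_policy(p)))
```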
     The third application is the modeling and optimization of networked media service systems. Reflecting the multi-layer control mechanism of next-generation systems, a three-layer SMSSCP model is proposed that provides a unified framework for performance analysis and policy optimization covering adaptive resource deployment, dynamic service composition, and user request scheduling. The streaming media server cluster is the fundamental component of such systems, and a dynamic file grouping (DFG) strategy is presented for load balancing within the cluster: it exploits the caches of the delivery (streaming) servers to alleviate the I/O-bandwidth bottleneck of the storage node, raises the access hit ratio of cached files to reduce the frequency of reads from storage, and balances the workload across the servers in the cluster to improve resource availability. A two-layer SMSSCP model is constructed for the performance analysis of the DFG strategy, and a reinforcement learning algorithm combining performance-potential estimation with policy iteration is then employed to optimize the DFG policy online when system and environment parameters are unknown. The algorithm exploits the event-driven policy structure to avoid requiring exact knowledge of parameters such as user access patterns and to reduce computation, which makes it efficient and feasible in practical applications.
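     A toy rendition of the load-balancing idea behind dynamic file grouping is sketched below: files are periodically regrouped by observed popularity, each group is pinned to one delivery server's cache, cache hits are routed to the caching server, and misses fall back to the least-loaded server, which must read from the storage node. All names, capacities, and the demand model are illustrative assumptions, not the DFG policy itself, which the thesis optimizes online.

```python
import random
from collections import Counter

random.seed(42)
SERVERS = ["s0", "s1", "s2"]
FILES = [f"f{i}" for i in range(30)]

popularity = Counter()                    # observed request counts per file
group_of = {}                             # file -> delivery server caching its group
load = Counter({s: 0 for s in SERVERS})   # requests served per server

def regroup(capacity_per_server=5):
    """Pin the most popular files, interleaved by popularity rank into
    one group per server, to the delivery servers' caches."""
    ranked = sorted(FILES, key=lambda f: -popularity[f])
    group_of.clear()
    for rank, f in enumerate(ranked[: capacity_per_server * len(SERVERS)]):
        group_of[f] = SERVERS[rank % len(SERVERS)]

def dispatch(f):
    """Route a request: a cache hit goes to its group's server; a miss goes
    to the least-loaded server, which must fetch from the storage node."""
    popularity[f] += 1
    if f in group_of:
        server, hit = group_of[f], True
    else:
        server, hit = min(SERVERS, key=load.__getitem__), False
    load[server] += 1
    return server, hit

regroup()
hits, N = 0, 10_000
for n in range(1, N + 1):
    # Skewed demand: 80% of requests target the six hottest files.
    f = random.choice(FILES[:6]) if random.random() < 0.8 else random.choice(FILES)
    hits += dispatch(f)[1]
    if n % 1000 == 0:
        regroup()                         # periodic regrouping tracks demand shifts

print("cache hit ratio: %.2f  loads: %s" % (hits / N, dict(load)))
```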
