Research on Several Communication Technologies in Multi-Agent Robotic Systems
Abstract
Considerable attention has been devoted to using communication to improve the coordination performance of multi-agent robotic systems, a research focus shared by the multi-agent-system and multi-robot-system fields. How to share information among multiple robots through communication is a key technique for cooperation and coordination. This thesis first introduces the three communication methods used in decentralized control systems, then summarizes and reviews the major topics and the state of the art of communication in cooperation, and finally describes typical methods for modelling communicative decentralized control systems, comparing and analyzing their advantages and shortcomings. On this basis, the following problems are studied in depth.
     When communication is free, a centralized control model for coordinating the multi-agent robotic system is established after the communication cost is parameterized: the presence of free communication reduces the computational complexity of multi-agent partially observable Markov decision processes (POMDPs) to that of single-agent POMDPs. To approximately solve POMDPs under uncertainty, a novel algorithm, Memetic-Algorithm-based Q-Learning (MA-Q-Learning), is proposed. Policies are evolved with a memetic algorithm, while an improved Q-learning supplies predicted rewards that serve as the fitness of the evolved policies. To cope with hidden states, a deterministic finite-step history of the agent's most recent experience is combined with the current belief state, a probability distribution over all possible states, when choosing the current optimal action. Search efficiency is improved by a hybrid search method in which an adjustment factor maintains population diversity and guides a combined crossover and mutation operator. Experiments on standard POMDP benchmark problems show that the proposed method outperforms other state-of-the-art approximate POMDP methods, and experiments on multi-agent coordination under free communication validate its effectiveness.
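     As a concrete illustration of two of the ingredients above, the sketch below shows the standard POMDP belief-state update and a rollout-based estimate of a policy's predicted reward, the kind of fitness value a memetic algorithm would assign to an evolved individual. Note the swap: the sketch estimates predicted reward with plain Monte-Carlo rollouts rather than the improved Q-learning the thesis uses, and the random toy model and all names are assumptions for illustration, not the MA-Q-Learning implementation.

```python
import numpy as np

# Toy POMDP drawn at random; illustrative only, not the thesis's domain.
rng = np.random.default_rng(0)
S, A, O = 4, 2, 3                      # states, actions, observations
T = rng.dirichlet(np.ones(S), (S, A))  # T[s, a, s']: transition model
Z = rng.dirichlet(np.ones(O), (S, A))  # Z[s', a, o]: observation model
R = rng.normal(size=(S, A))            # R[s, a]: reward model

def belief_update(b, a, o):
    """Bayes filter over states: b'(s') ∝ Z[s',a,o] * sum_s T[s,a,s'] b(s)."""
    b_new = Z[:, a, o] * (b @ T[:, a, :])
    return b_new / b_new.sum()

def predicted_reward(policy, episodes=200, horizon=20, gamma=0.95):
    """Monte-Carlo estimate of discounted return: the fitness a memetic
    algorithm could assign to the evolved individual `policy`."""
    total = 0.0
    for _ in range(episodes):
        s, b = 0, np.full(S, 1.0 / S)          # true state, uniform belief
        for t in range(horizon):
            a = policy(b)
            total += gamma**t * R[s, a]
            s = rng.choice(S, p=T[s, a])       # environment transitions
            o = rng.choice(O, p=Z[s, a])       # observation of the new state
            b = belief_update(b, a, o)         # agent tracks its belief
    return total / episodes

# Example individual: act greedily on the belief-expected immediate reward.
greedy = lambda b: int(np.argmax(b @ R))
print(predicted_reward(greedy))
```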
     Although the presence of free communication reduces the computational complexity of multi-agent POMDPs to that of single-agent POMDPs, in practice communication is not free, and reducing its amount is often desirable. To reduce the communication needed for coordinating a multi-agent robotic system, this thesis presents a novel approach for making communication decisions in a decentralized fashion: the possible joint beliefs of the team are represented as a directed acyclic graph, and an agent chooses to communicate only when its local observations indicate that sharing information would increase the expected reward. By maintaining and reasoning over the possible joint beliefs of the team, centralized single-agent policies can be applied to decentralized multi-agent POMDPs. Experiments and a detailed example show that the proposed DAG-DEC-COMM algorithm reduces communication while improving the performance of decentralized execution.
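     The following is a minimal sketch, under assumed toy numbers, of the myopic decision rule just described: an agent keeps the set of joint beliefs the team could be in (in DAG-DEC-COMM these are the merged nodes of the directed acyclic graph), and it communicates only when conditioning that set on its own observations would change the chosen joint action by more than the cost of a message. The probabilities, Q-values, and names are illustrative, not the thesis code.

```python
import numpy as np

# Three possible joint beliefs the team might hold, with their prior
# probability, the value of each joint action under each belief, and a
# flag for whether the belief is consistent with this agent's private
# observation history. All numbers are assumptions for illustration.
p    = np.array([0.5, 0.3, 0.2])          # probability of each belief node
Q    = np.array([[5.0, 1.0],              # Q[node, joint action]
                 [1.0, 4.0],
                 [1.0, 3.0]])
mine = np.array([False, True, True])      # consistent with my observations?

def should_communicate(comm_cost=0.3):
    """Communicate iff sharing my observations raises expected reward."""
    a_silent = int(np.argmax(p @ Q))          # best action without sharing
    pc = p * mine / (p * mine).sum()          # posterior given my history
    a_shared = int(np.argmax(pc @ Q))         # best action after sharing
    gain = pc @ Q[:, a_shared] - pc @ Q[:, a_silent]
    return gain > comm_cost

print(should_communicate())   # True here: sharing changes the action choice
```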
     Unreliable communication is a common feature of many real-world multi-agent applications, multi-agent robot systems in particular. Limited bandwidth, interference, and loss of line of sight are among the reasons communication can fail. We introduce an improved Adopt algorithm that operates effectively over unreliable communication infrastructure in the context of the Distributed Constraint Optimization Problem (DCOP). The key idea is to reduce unnecessary communication while an adaptive timeout mechanism preserves the liveness needed to find the optimal solution; the number of messages exchanged thus decreases, and the adaptive timeout lets the algorithm deal flexibly and robustly with message loss. Results show that, with a few modifications, Adopt is guaranteed to terminate with the optimal solution even in the presence of message loss, and that time to solution degrades gracefully as the message-loss probability increases. The results also suggest that artificially introducing message loss, even over reliable infrastructure, can reduce the amount of work agents must do to find the optimal solution.
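     A minimal simulation sketch of the adaptive-timeout mechanism follows, under an assumed toy link model: an agent that has heard nothing for a timeout period retransmits its last message to preserve liveness, and the timeout shrinks when the link looks lossy and grows when it looks healthy, so reliable links are not flooded with redundant traffic. It illustrates the mechanism only; it is not the improved Adopt implementation.

```python
import random

random.seed(1)

def run(loss_p, ticks=400):
    """Simulate one agent on a link that drops messages with prob loss_p."""
    timeout, silent, resends, delivered = 4, 0, 0, 0
    pending = False                          # one message in flight
    for _ in range(ticks):
        if pending and random.random() > loss_p:
            delivered += 1                   # the reply got through
            pending, silent = False, 0
            timeout = min(timeout + 1, 16)   # link looks healthy: back off
        else:
            silent += 1
        if silent >= timeout:                # silence too long: retransmit
            pending, resends = True, resends + 1
            silent = 0
            timeout = max(timeout - 1, 2)    # link looks lossy: retry sooner
    return resends, delivered

for p in (0.0, 0.3, 0.6):
    print(p, run(p))  # retransmissions grow, deliveries shrink, with loss
```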
     Multi-agent teamwork has drawn much recent attention, and mixed human-agent teams have been applied in many fields. This thesis investigates, designs, and implements a multi-agent human-robot team system based on mobile information devices. It first presents the system architecture, then designs and implements the communication system between humans and robots and among the robots themselves, realizing information sharing among team members. Experimental results show that users can interact with the robots in a natural and convenient way to carry out remote monitoring and control tasks, and that the robot team members construct a team world model by communicating their local environment information, which benefits team coordination.
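     To make the team-model construction concrete, the sketch below shows one simple way robots might share local environment information, assuming a hypothetical JSON message format: each robot broadcasts the grid cells it has observed, and receivers keep the newest observation per cell, so every member converges on the same team map. The field names and map representation are assumptions for illustration, not the system's actual protocol.

```python
import json
import time

def local_update_msg(robot_id, cells):
    """Serialize one robot's local observations.
    cells: {(x, y): 'free' | 'occupied'} observed by this robot.
    Field names here are hypothetical, not the system's wire format."""
    return json.dumps({
        "robot": robot_id,
        "stamp": time.time(),
        "cells": [[x, y, v] for (x, y), v in cells.items()],
    })

def merge(team_map, msg):
    """Overlay one robot's observations; keep the newest view per cell."""
    m = json.loads(msg)
    for x, y, v in m["cells"]:
        seen = team_map.get((x, y))
        if seen is None or seen[1] < m["stamp"]:
            team_map[(x, y)] = (v, m["stamp"])
    return team_map

# Two robots report partly overlapping cells; the later report wins.
team = {}
merge(team, local_update_msg("r1", {(0, 0): "free", (1, 0): "occupied"}))
merge(team, local_update_msg("r2", {(1, 0): "free", (2, 3): "occupied"}))
print(team)
```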