Research on Algorithm and Application of Visual Object Tracking Under Complex Environment


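English title: Research on Algorithm and Application of Visual Object Tracking Under Complex Environment
Author: Xue Chen (薛陈)
Degree level: Doctoral
Discipline: Mechatronic Engineering
Keywords: Visual target tracking ; Mean-shift ; Centroid weighted ; Occlusion ; Video tracker
Degree year: 2010
Supervisor: Zhu Ming (朱明)
Discipline code: 080202
Degree-granting institution: Graduate University of Chinese Academy of Sciences (Changchun Institute of Optics, Fine Mechanics and Physics)
Submission date: 2010-04-10
Chair of the defense committee: Ma Siliang (马驷良)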

Abstract

Visual object tracking is one of the key techniques in computer vision and has broad application prospects in civilian and military fields, including intelligent surveillance, vision-based human-computer interaction, intelligent transportation, visual navigation of robots, and precision guidance systems. With the rapid development of information technology, a growing number of researchers have turned to visual object tracking and proposed many effective algorithms that perform well in specific settings. Nevertheless, developing a robust tracking algorithm that copes with complex environments (such as cluttered backgrounds, changes in target appearance, and occlusion) and implementing it in an engineered system remains difficult.
     Driven by engineering needs, this dissertation studies visual object tracking in complex environments from two sides: robust tracking algorithms and their realization on a hardware platform. The main contributions are as follows:
     (1) The mean-shift algorithm is studied in depth. In the traditional mean-shift algorithm, background pixels inside the tracking window bias the estimated target position. To address this, an improved mean-shift algorithm based on the most discriminative grey-level histogram features is proposed. From the difference between the grey-level distributions of the target and the background in the initial frame, a log-likelihood image is built, and the histogram features that best separate the target from the background are selected to build the target model. Candidate models in subsequent frames are built the same way. The improved algorithm reduces the influence of background pixels on localization, improves tracking accuracy, and needs fewer iterations, which speeds up computation.
     (2) A centroid weighted algorithm is proposed and then improved. It obtains the final target position as the mathematical expectation of the centroid positions of the pixels that share each grey level within the tracking region. Besides the grey-level statistics of the target, the algorithm uses the spatial information of the grey-level distribution, so the target description is richer and localization is more accurate. It needs a single computation step with no iteration, which suits real-time use. Compared with the improved mean-shift algorithm, the improved centroid weighted algorithm is more robust to occlusion. A template update strategy for the tracking process is also proposed, which further improves robustness.
     (3) Occlusion is studied in depth and a complete tracking algorithm is proposed. An edge-weighted Bhattacharyya coefficient is introduced first; it is highly sensitive to occlusion and identifies the moment at which the target becomes occluded. The algorithm uses the Kalman filter as its basic tracking framework and handles occlusion according to its degree. Partial occlusion needs no special treatment, because the improved centroid weighted algorithm already copes with it. Severe occlusion is handled by a fragment-based tracking algorithm. Under total occlusion, the Kalman prediction is taken as the target position and the correction step of the filter is suspended. The resulting algorithm is robust to partial, severe, and total occlusion.
     (4) A video tracker built on a DSP+FPGA architecture is developed. On the hardware side, the problems of low-temperature operation, FPGA configuration, and electromagnetic compatibility (EMC) are solved. On the software side, the DSP program is written and optimized, and a correlation algorithm that is robust to illumination changes is ported to the DSP. The video tracker has passed environmental testing; it works reliably and stably, tracks well in real time, meets all specified requirements, and is now used in a practical engineering project.
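The feature selection in contribution (1) can be illustrated with a minimal sketch. It assumes the log-likelihood of a grey-level bin is the log ratio of the target and background histogram values, and that a bin counts as discriminative when this ratio exceeds a threshold. The function names, the bin count, the threshold, and the `eps` floor are all illustrative assumptions; the abstract does not give the dissertation's actual formulation.

```python
import math

def grey_histogram(pixels, bins=16, levels=256):
    """Normalized grey-level histogram of an iterable of values in 0..levels-1."""
    hist = [0.0] * bins
    width = levels // bins
    count = 0
    for v in pixels:
        hist[min(v // width, bins - 1)] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist

def log_likelihood_ratio(target_hist, background_hist, eps=1e-3):
    """Per-bin log ratio; eps (assumed) keeps empty bins from dividing by zero."""
    return [math.log(max(p, eps) / max(q, eps))
            for p, q in zip(target_hist, background_hist)]

def select_discriminative_bins(target_hist, background_hist, threshold=0.0):
    """Indices of bins that are more typical of the target than the background."""
    ratio = log_likelihood_ratio(target_hist, background_hist)
    return [u for u, r in enumerate(ratio) if r > threshold]

def masked_model(hist, selected):
    """Keep only the selected bins and renormalize to build a target model."""
    keep = set(selected)
    kept = [h if u in keep else 0.0 for u, h in enumerate(hist)]
    total = sum(kept)
    return [k / total for k in kept] if total > 0 else kept

# Toy data: a mostly bright target on a mostly dark background.
target = grey_histogram([200] * 80 + [50] * 20)
background = grey_histogram([50] * 90 + [200] * 10)
selected = select_discriminative_bins(target, background)
model = masked_model(target, selected)
```

In this toy case only the bright bin survives, so background-like grey levels inside the tracking window no longer contribute to the model.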
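The one-step localization in contribution (2) can be sketched as follows. The abstract states only that the position is the expectation of the per-grey-level centroids. Using the target model histogram as the probability weights, and quantizing grey levels into bins, are assumptions made here for illustration; the function name is hypothetical.

```python
def centroid_weighted_position(window, model_weights, bins=16, levels=256):
    """Estimate the target position inside a tracking window in one pass.

    window: list of rows of grey values (0..levels-1).
    model_weights: one non-negative weight per grey-level bin (assumed to
    come from the target model histogram).
    Returns (x, y) in window coordinates, or None if no weighted bin occurs.
    """
    width = levels // bins
    sum_x = [0.0] * bins
    sum_y = [0.0] * bins
    count = [0] * bins
    for y, row in enumerate(window):
        for x, v in enumerate(row):
            u = min(v // width, bins - 1)
            sum_x[u] += x
            sum_y[u] += y
            count[u] += 1

    est_x = est_y = total_w = 0.0
    for u in range(bins):
        w = model_weights[u]
        if count[u] == 0 or w <= 0:
            continue
        # Centroid of all pixels that fall in grey-level bin u.
        est_x += w * sum_x[u] / count[u]
        est_y += w * sum_y[u] / count[u]
        total_w += w
    if total_w == 0:
        return None
    return (est_x / total_w, est_y / total_w)

# Toy window: dark background with a bright 2x2 target at x in {3,4}, y in {1,2}.
window = [[10] * 5 for _ in range(5)]
for yy in (1, 2):
    for xx in (3, 4):
        window[yy][xx] = 200
weights = [0.0] * 16
weights[200 // 16] = 1.0  # the model contains only the bright bin
position = centroid_weighted_position(window, weights)
```

Because every pixel is visited once and no iteration follows, the cost per frame is fixed, which is consistent with the real-time claim in the abstract.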
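The occlusion strategy in contribution (3) can be sketched under several assumptions. The code below uses the standard Bhattacharyya coefficient, not the edge-weighted variant proposed in the dissertation, whose definition the abstract does not give. The thresholds, the class `AxisKalman`, and the one-axis constant-velocity model are hypothetical choices for illustration.

```python
import math

def bhattacharyya(p, q):
    """Standard Bhattacharyya coefficient of two normalized histograms."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def occlusion_level(rho, severe_thr=0.7, total_thr=0.3):
    """Map a similarity score to a handling strategy (thresholds are assumed)."""
    if rho < total_thr:
        return "total"
    if rho < severe_thr:
        return "severe"
    return "none_or_partial"

class AxisKalman:
    """Constant-velocity Kalman filter for one image axis (position, velocity)."""

    def __init__(self, pos=0.0, vel=0.0, q=0.01, r=1.0):
        self.p, self.v = pos, vel
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q, self.r = q, r              # process and measurement noise

    def predict(self, dt=1.0):
        (a, b), (c, d) = self.P
        self.p += dt * self.v
        # P = F P F^T + qI with F = [[1, dt], [0, 1]]
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.p

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r                     # innovation variance
        k0, k1 = a / s, c / s              # Kalman gain
        y = z - self.p                     # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
        return self.p

def track_step(kf, measurement, rho):
    """One frame: always predict, but skip the correction under total occlusion."""
    predicted = kf.predict()
    if occlusion_level(rho) == "total":
        return predicted
    return kf.update(measurement)

kf = AxisKalman()
for z in [0.0, 1.0, 2.0, 3.0, 4.0]:        # target visible, moving right
    track_step(kf, z, rho=0.95)
velocity = kf.v
p1 = track_step(kf, None, rho=0.1)          # target fully occluded
p2 = track_step(kf, None, rho=0.1)
```

While the target is fully occluded the filter coasts at its last estimated velocity, and its covariance grows until a trustworthy measurement arrives again. The severe-occlusion branch (fragment-based tracking) is not sketched here.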

