Research and Applications of Video Object Tracking


English title: Research and Applications of Video Object Tracking
Author: 韩晓波 (Han Xiaobo)
Degree level: Master's
Discipline: Signal and Information Processing
Keywords: intelligent video surveillance ; object tracking ; Mean Shift ; MAP-SP Mean Shift
Degree year: 2010
Advisor: 李厚强 (Li Houqiang)
Discipline code: 081002
Degree-granting institution: University of Science and Technology of China
Thesis submission date: 2010-05-20

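Abstract

Intelligent video surveillance is an important research area in computer vision, with broad application prospects in public safety, government, finance, and education. It applies techniques from image processing, machine learning, and computer vision to analyze video without human intervention, automatically detecting, tracking, and recognizing objects, and it builds higher-level behavior understanding and description on those results. Object tracking is a key technology in intelligent video surveillance and the foundation of object recognition and behavior understanding, so it has broad research and application value.
     Academia and industry have done a great deal of research on object tracking and have proposed many valuable algorithms. Among them, Mean Shift is widely used because it is simple, practical, and fast enough for real-time use. However, the algorithm still has a number of problems, for example when the background resembles the target or when the target is under severe partial occlusion. Building on previous work, this thesis makes the following contributions toward solving these problems:
     1. To address the shortcomings of Mean Shift, a fast and effective object tracking algorithm, MAP Spatial Pyramid (MAP-SP) Mean Shift, is proposed. The algorithm integrates background information into the Mean Shift tracking framework and dynamically partitions the target into blocks during tracking, which preserves some of the target's geometric structure. Experiments show that the algorithm can handle both background-target similarity and severe partial occlusion.
     2. Local-feature-based object tracking is studied. Because the traditional local feature descriptor SIFT has high computational complexity, a new local feature descriptor with a simple structure is proposed and integrated into the MAP-SP Mean Shift algorithm. Experimental results show that it effectively improves tracking performance without affecting real-time operation.
     3. An experimental intelligent video surveillance system is designed and implemented. The system has a modular design consisting of three basic modules: object detection, object analysis, and object tracking. It was tested on video sequences commonly used in intelligent video surveillance and on self-collected sequences, with good results, and it provides a platform for experimental testing in future research.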
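As a rough illustration of the baseline that the contributions above build on, the following sketch implements a plain histogram-based Mean Shift localization loop in Python with NumPy. It is not the MAP-SP Mean Shift algorithm of this thesis: it assumes a uniform kernel and a single grayscale channel, and it has no background model or block partitioning. The function names (`gray_histogram`, `mean_shift_locate`) and all parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def gray_histogram(patch, bins=8):
    """Normalized histogram of a grayscale patch with values in [0, 256)."""
    idx = (patch.ravel().astype(int) * bins) // 256
    hist = np.bincount(idx, minlength=bins).astype(float)
    return hist / max(hist.sum(), 1e-12)

def mean_shift_locate(frame, target_hist, window, bins=8, max_iter=20):
    """Move window (x, y, w, h) toward the region that best matches target_hist.

    Each pixel is weighted by sqrt(q_u / p_u), where u is the pixel's bin,
    q is the target histogram and p is the current candidate histogram.
    With a uniform kernel, the new window center is the weighted centroid.
    """
    x, y, w, h = window
    rows, cols = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]  # pixel coordinates inside the window
    for _ in range(max_iter):
        patch = frame[y:y + h, x:x + w]
        cand_hist = gray_histogram(patch, bins)
        idx = (patch.astype(int) * bins) // 256
        weights = np.sqrt(target_hist[idx] / np.maximum(cand_hist[idx], 1e-12))
        total = weights.sum()
        if total <= 0:  # nothing in the window resembles the target
            break
        # Offset of the weighted centroid from the window center
        dx = (weights * xs).sum() / total - (w - 1) / 2.0
        dy = (weights * ys).sum() / total - (h - 1) / 2.0
        new_x = int(np.clip(np.floor(x + dx + 0.5), 0, cols - w))
        new_y = int(np.clip(np.floor(y + dy + 0.5), 0, rows - h))
        if (new_x, new_y) == (x, y):  # converged at pixel resolution
            break
        x, y = new_x, new_y
    return x, y, w, h

# Synthetic example: a bright 10x10 square on a dark background
frame = np.zeros((60, 60), dtype=np.uint8)
frame[25:35, 30:40] = 200
target_hist = gray_histogram(frame[25:35, 30:40])
print(mean_shift_locate(frame, target_hist, (26, 22, 10, 10)))
```

Starting from a window that only partly overlaps the square, the loop should settle within about a pixel of the square's true position at (30, 25). Because the weights depend only on histogram bins, a background with the same intensities as the target would pull the window away, which is the kind of failure the thesis sets out to address.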
Intelligent video surveillance is one of the most important research domains in computer vision and plays a key role in security and military protection. It aims to achieve automatic detection, tracking, and recognition of objects using image processing, machine learning, and computer vision technologies, and it further helps to analyze and understand the behavior of moving objects. Within intelligent video surveillance, object tracking is the basis of object recognition and behavior analysis, and its performance directly influences the whole system. Research on object tracking is therefore of great significance in both theory and application.
     To date, researchers from both academia and industry have made great efforts and proposed many valuable object tracking algorithms. Of these, Mean Shift and Particle Filter are two of the most mature and useful in practice. However, both algorithms still have drawbacks. For example, they may not work well when the background is similar to the target or when there is severe partial occlusion, and both situations are common in practical applications. Consequently, it is still difficult to design a practical and effective object tracking algorithm. Building on previous work, this dissertation studies object tracking algorithms and makes the following contributions:
     The Mean Shift object tracking algorithm is investigated and its disadvantages are analyzed. A fast and efficient object tracking algorithm, MAP Spatial Pyramid (MAP-SP) Mean Shift, is then proposed. The proposed algorithm incorporates background information into the Mean Shift framework and divides the target dynamically during tracking in order to adaptively preserve its geometric structure.
     Local-feature-based object tracking is also investigated. A new local feature descriptor is proposed to avoid the high complexity of the traditional local feature descriptor SIFT. The new descriptor is introduced into the MAP-SP Mean Shift framework to improve tracking performance. Because the descriptor is simple and easy to implement, it improves tracking performance while still meeting real-time requirements.
     Experimental results demonstrate that the proposed approaches overcome some drawbacks of existing algorithms, satisfy real-time requirements, and improve tracking performance.
     This dissertation also designs and implements an experimental video surveillance system based on the proposed algorithms. The system has a modular design consisting of motion detection, object analysis, and object tracking. Its effectiveness is demonstrated through comparative experiments on both standard video sequences and our own, and it provides an experimental platform for future research.
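The abstract states that dividing the target into blocks keeps some of its geometric structure. The sketch below illustrates that general idea only: it compares two patches with the Bhattacharyya coefficient, first using one global histogram and then using per-block histograms on a fixed 2x2 grid. This is an assumption-laden toy, not the thesis's method, which partitions the target dynamically and includes a MAP formulation that is not reproduced here. The function names and the grid size are hypothetical.

```python
import numpy as np

def gray_histogram(patch, bins=8):
    """Normalized histogram of a grayscale patch with values in [0, 256)."""
    idx = (patch.ravel().astype(int) * bins) // 256
    hist = np.bincount(idx, minlength=bins).astype(float)
    return hist / max(hist.sum(), 1e-12)

def bhattacharyya(p, q):
    """Bhattacharyya coefficient: 1.0 for identical distributions, 0.0 for disjoint ones."""
    return float(np.sum(np.sqrt(p * q)))

def block_histograms(patch, grid=2, bins=8):
    """Histograms of the grid x grid blocks of a patch, in row-major order."""
    row_edges = np.linspace(0, patch.shape[0], grid + 1).astype(int)
    col_edges = np.linspace(0, patch.shape[1], grid + 1).astype(int)
    return [
        gray_histogram(patch[row_edges[i]:row_edges[i + 1],
                             col_edges[j]:col_edges[j + 1]], bins)
        for i in range(grid) for j in range(grid)
    ]

def block_similarity(a, b, grid=2, bins=8):
    """Mean Bhattacharyya coefficient over corresponding blocks of two patches."""
    pairs = zip(block_histograms(a, grid, bins), block_histograms(b, grid, bins))
    return float(np.mean([bhattacharyya(p, q) for p, q in pairs]))

# A target that is dark on top and bright below, and the same patch flipped vertically
target = np.zeros((8, 8), dtype=np.uint8)
target[:4] = 20
target[4:] = 220
flipped = np.flipud(target)

global_score = bhattacharyya(gray_histogram(target), gray_histogram(flipped))
block_score = block_similarity(target, flipped)
print(global_score, block_score)
```

The global histograms of the two patches are identical, so the global score cannot tell them apart, while the block-wise score drops because each block now holds the opposite intensity. Per-block scores also make it possible to discount individual blocks that match poorly, which is one common way block-based trackers cope with partial occlusion.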

