Research on Image Retrieval Methods Based on the Visual Attention Mechanism

English title: Image Retrieval Based on Visual Attention Mechanism
Author: Li Yan (李艳)
Degree level: Master's
Discipline: Computer Science and Technology
Keywords: Salient regions; Visual attention mechanism; SIFT descriptors; Similarity; Spatial layout
Degree year: 2010
Advisor: Huang Dongjun (黄东军)
Discipline code: 081202
Degree-granting institution: Central South University (中南大学)
Thesis submission date: 2010-05-01

Abstract

Content-based image retrieval (CBIR) has been an active research topic in recent years. Traditional image retrieval systems typically build an index from low-level image features (color, texture, shape, and so on) and search against it. This global approach is limited in how well it expresses image content, because it ignores the fact that different regions of an image attract the human eye to different degrees. Most of the region-based retrieval methods proposed later rely on image segmentation to delimit regions, and accurate image segmentation remains an unsolved problem, so their retrieval results are often unsatisfactory.
Related studies show that when people observe a scene, they concentrate their attention on the parts of the image that interest them, so retrieving by region of interest is a relatively effective way to express the user's intent. Building on an analysis and summary of the state and trends of content-based image retrieval, and drawing on the selective attention mechanism studied in the psychology of human vision, this thesis fuses the Itti-Koch and Stentiford attention models and proposes a new retrieval method based on the salient regions of an image (the user's regions of interest). First, the existing attention models are improved so that the extracted salient regions agree better with human observation. Second, the regions of interest are described by combining local and global information: the method considers the stable features within each region and also uses the spatial layout of the regions to reflect the overall composition of the image, and it combines the two for retrieval. This overcomes the sensitivity of traditional retrieval to image rotation, translation, and brightness changes, and it mirrors how humans perceive objects. The method extracts regions of interest automatically, which removes the need to mark salient regions by hand and makes the matching targets more definite. In addition, using salient regions as retrieval cues helps to suppress interference from background information, so that retrieval follows the user's intent more closely. Experiments show that the method achieves better retrieval performance than traditional retrieval based on global features.
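The abstract names two combination steps (fusing two attention models into one saliency map, and combining local-feature similarity with spatial-layout similarity) but gives no formulas. The Python sketch below is only one plausible illustration of those steps, not the thesis's actual method. The function names, the min-max normalization, the linear weights `weight` and `alpha`, and the threshold `ratio` are all assumptions introduced here. The two input maps stand in for the outputs of the Itti-Koch and Stentiford models, which are not implemented.

```python
import numpy as np


def normalize(saliency_map):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    m = np.asarray(saliency_map, dtype=float)
    span = m.max() - m.min()
    if span == 0:
        return np.zeros_like(m)
    return (m - m.min()) / span


def fuse_saliency(map_a, map_b, weight=0.5):
    """Assumed fusion rule: weighted sum of the two normalized maps."""
    return weight * normalize(map_a) + (1.0 - weight) * normalize(map_b)


def salient_mask(fused, ratio=0.5):
    """Assumed region rule: keep pixels at or above ratio * peak saliency."""
    fused = np.asarray(fused, dtype=float)
    peak = fused.max()
    if peak == 0:
        # No saliency anywhere, so no pixel is marked salient.
        return np.zeros(fused.shape, dtype=bool)
    return fused >= ratio * peak


def combined_score(local_sim, layout_sim, alpha=0.5):
    """Assumed ranking rule: blend local-feature and spatial-layout similarity."""
    return alpha * local_sim + (1.0 - alpha) * layout_sim


if __name__ == "__main__":
    # Toy 2x2 maps standing in for the two attention models' outputs.
    model_a = np.array([[0.0, 1.0], [2.0, 4.0]])
    model_b = np.array([[0.0, 0.0], [0.0, 10.0]])
    fused = fuse_saliency(model_a, model_b)
    print(salient_mask(fused))
    print(combined_score(local_sim=0.8, layout_sim=0.4))
```

In a fuller pipeline, `local_sim` would presumably come from matching the SIFT descriptors listed in the keywords inside the masked regions, and `layout_sim` from comparing the relative positions of those regions. Both are left as plain numbers here because the abstract does not specify how they are computed.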
