A Review of Monocular Depth Estimation Based on Deep Learning
  • Title: A Review of Monocular Depth Estimation Based on Deep Learning (original Chinese title: 基于深度学习的单目视觉深度估计研究综述)
  • Authors: GUO Jifeng (郭继峰); BAI Chengchao (白成超); GUO Shuang (郭爽)
  • Affiliation: School of Astronautics, Harbin Institute of Technology (哈尔滨工业大学航天学院)
  • Keywords: depth estimation; deep learning; monocular vision; supervised/unsupervised learning; conditional random field; semantic information
  • Journal code: UMST
  • Journal: Unmanned Systems Technology (无人系统技术)
  • Publication date: 2019-04-15
  • Publisher: Unmanned Systems Technology (无人系统技术)
  • Year: 2019
  • Issue: v.2
  • Funding: National Natural Science Foundation of China (11472090)
  • Language: Chinese
  • Page: UMST201902003
  • Number of pages: 10
  • CN: 02
  • ISSN: 10-1565/TJ
  • Classification number: 16-25
Abstract
As systems become more intelligent, the demands placed on depth estimation keep rising. Lidar and stereo vision are widely used and achieve good results, but the weight, volume, and cost of these sensors have motivated a new line of research: measuring depth accurately using only low-cost monocular vision. This paper first analyzes the characteristics and shortcomings of existing methods for extracting depth information and explains why monocular depth estimation matters. It then classifies recent deep-learning-based methods for monocular depth estimation and analyzes their characteristics, covering supervised learning, unsupervised learning, semi-supervised learning, conditional random field (CRF) based methods, joint semantic segmentation, and methods that use other auxiliary information. Finally, it briefly analyzes future development trends in the field.
References
[1]David E,Christian P,Rob F. Depth map prediction from a single image using a multi-scale deep network[C]. The 28th Conference on Neural Information Processing Systems,Montréal,Canada,December 8-13,2014.
    [2]David E,Rob F. Predicting depth,surface normals and semantic labels with a common multi-scale convolutional architecture[C]. The 15th International Conference on Computer Vision,Santiago,Chile,December 13-16,2015.
    [3]Jun L,Reinhard K,Angela Y. A two-streamed network for estimating fine-scaled depth maps from single RGB images[C]. The 16th International Conference on Computer Vision,Venice,Italy,October 22-29,2017.
    [4]Iro L,Christian R,Vasileios B. Deeper depth prediction with fully convolutional residual networks[C]. The 4th International Conference on 3D Vision,CA,USA,October 25-28,2016.
    [5]Keisuke T,Federico T,Iro L,et al. CNN-SLAM: Real-time dense monocular SLAM with learned depth prediction[C]. The 30th IEEE Conference on Computer Vision and Pattern Recognition,Hawaii,USA,July 22-25,2017.
    [6]Daniel Z,Phillip I,Dilip K,et al. Learning ordinal relationships for mid-level vision[C]. The 15th International Conference on Computer Vision,Santiago,Chile,December 13-16,2015.
    [7]Chen W F,Fu Z,Yang D W,et al. Single-image depth perception in the wild[C]. The 30th Conference on Neural Information Processing Systems,Barcelona,Spain,December 5-10,2016.
    [8]Zhou T H,Matthew B,Noah S,et al. Unsupervised learning of depth and ego-motion from video[C]. The 30th IEEE Conference on Computer Vision and Pattern Recognition,Hawaii,USA,July 22-25,2017.
    [9]Yin Z C,Shi J P. GeoNet: unsupervised learning of dense depth,optical flow and camera pose[C]. The 31st IEEE Conference on Computer Vision and Pattern Recognition,Salt Lake City,USA,June 19-21,2018.
    [10]Reza M,Martin W,Anelia A. Unsupervised learning of depth and ego-motion from monocular video using 3D geometric constraints[C]. The 31st IEEE Conference on Computer Vision and Pattern Recognition,Salt Lake City,USA,June 19-21,2018.
    [11]Clement G,Oisin M A,Gabriel J B. Unsupervised monocular depth estimation with left-right consistency[C]. The 30th IEEE Conference on Computer Vision and Pattern Recognition,Hawaii,USA,July 22-25,2017.
    [12]Zhang Y D,Ravi G,Chamara S W,et al. Unsupervised learning of monocular depth estimation and visual odometry with deep feature reconstruction[C]. The 31st IEEE Conference on Computer Vision and Pattern Recognition,Salt Lake City,USA,June 19-21,2018.
    [13]Yevhen K,Jorg S,Bastian L. Semi-supervised deep learning for monocular depth map prediction[C]. The 30th IEEE Conference on Computer Vision and Pattern Recognition,Hawaii,USA,July 22-25,2017.
    [14]Yang N,Wang R,Jorg S,et al. Deep virtual stereo odometry: leveraging deep depth prediction for monocular direct sparse odometry[C]. The 15th European Conference on Computer Vision,Munich,Germany,September 8-14,2018.
    [15]Tulyakov S,Ivanov A,Fleuret F. Semi-supervised learning of deep metrics for stereo reconstruction[J]. arXiv Preprint,2016.
    [16]Stepan T,Anton I. Weakly supervised learning of deep metrics for stereo reconstruction[C]. The 16th International Conference on Computer Vision,Venice,Italy,October 22-29,2017.
    [17]Liu F Y,Shen C H,Lin G S. Deep convolutional neural fields for depth estimation from a single image[C]. The 28th IEEE Conference on Computer Vision and Pattern Recognition,Boston,USA,June 8-10,2015.
    [18]Xu D,Elisa R,Ouyang W L,et al. Multi-scale continuous CRFs as sequential deep networks for monocular depth estimation[C]. The 30th IEEE Conference on Computer Vision and Pattern Recognition,Hawaii,USA,July 22-25,2017.
    [19]Arsalan M,Hamed P,Jana K. Joint semantic segmentation and depth estimation with deep convolutional networks[C]. The 4th International Conference on 3D Vision,CA,USA,October 25-28,2016.
    [20]Zhang Z Y,Alexander G S,Sanja F,et al. Monocular object instance segmentation and depth ordering with CNNs[C]. The 15th International Conference on Computer Vision,Santiago,Chile,December 13-16,2015.
    [21]Liu B Y,Stephen G,Daphne K. Single image depth estimation from predicted semantic labels[C]. The 23rd IEEE Conference on Computer Vision and Pattern Recognition,San Francisco,USA,June 13-18,2010.
    [22]Wang P,Shen X H,Lin Z,et al. Towards unified depth and semantic prediction from a single image[C]. The 28th IEEE Conference on Computer Vision and Pattern Recognition,Boston,USA,June 8-10,2015.
    [23]Pratul P S,Rahul G,Neal W,et al. Aperture supervision for monocular depth estimation[C]. The 31st IEEE Conference on Computer Vision and Pattern Recognition,Salt Lake City,USA,June 19-21,2018.
    [24]Qi X J,Liao R J,Liu Z Z,et al. GeoNet: Geometric neural network for joint depth and surface normal estimation[C]. The 31st IEEE Conference on Computer Vision and Pattern Recognition,Salt Lake City,USA,June 19-21,2018.
    [25]Zhang Y D,Thomas F. Deep depth completion of a single RGB-D image[C]. The 31st IEEE Conference on Computer Vision and Pattern Recognition,Salt Lake City,USA,June 19-21,2018.
    [26]Wang P,Shen X H,Bryan R,et al. SURGE: Surface regularized geometry estimation from a single image[C]. The 30th Conference on Neural Information Processing Systems,Barcelona,Spain,December 5-10,2016.
    [27]Li B,Dai Y,Chen H,et al. Single image depth estimation by dilated deep residual convolutional neural network and soft-weight-sum inference[J]. arXiv Preprint,2017.
    [28]Cao Y Z,Wu Z F,Shen C H. Estimating depth from monocular images as classification using deep fully convolutional residual networks[J]. IEEE Transactions on Circuits and Systems for Video Technology,2018,28(11): 3174-3182.
