Field road scene recognition in hilly regions based on an improved dilated convolutional neural network
  • English title: Field road scene recognition in hilly regions based on improved dilated convolutional networks
  • Authors: Li Yunwu; Xu Junjie; Liu Dexiong; Yu Yao
  • Keywords: agricultural machinery; navigation; machine vision; field roads; scene recognition; semantic segmentation; dilated convolutional networks
  • Journal: 农业工程学报 (Transactions of the Chinese Society of Agricultural Engineering)
  • Affiliations: College of Engineering and Technology, Southwest University; Chongqing Key Laboratory of Agricultural Equipment for Hilly and Mountainous Regions; Guizhou Mountain Agricultural Machinery Research Institute
  • Publication date: 2019-04-08
  • Year: 2019
  • Issue: v.35; No.359 (Issue 07)
  • Pages: 158-167 (10 pages)
  • CN: 11-2047/S
  • Funding: Guizhou Province Science and Technology Support Program ([2019]2384)
  • Language: Chinese
  • Database record: NYGU201907019
Abstract
Machine-vision-based autonomous navigation is one of the main navigation modes for intelligent agricultural machinery. The complex field road scenes of hilly and mountainous regions make autonomous navigation and obstacle avoidance of intelligent agricultural machinery on field roads quite difficult. Based on the image features of field roads in hilly regions, this paper divided field road scene objects into 11 classes: background, road, pedestrian, vegetation, sky, building, livestock, obstacle, pond, soil and pole, and constructed an image semantic segmentation model for field road scenes based on a dilated convolutional neural network. The model consists of a front-end module and a context module: the front-end module is an improved VGG-16 structure incorporating dilated convolution, the context module is a cascade of dilated convolution layers with different dilation rates, and a two-stage method was used for training. Using the Caffe deep learning framework, the improved network models were compared with the classical FCN-8s network model, and adaptability to road shadows was also tested. The semantic segmentation tests show that the Front-end+Large network achieved the highest pixel accuracy, mean per-class accuracy and mean intersection over union, while the FCN-8s network scored lowest; the Front-end+Large network reached a mean intersection over union of 73.4% on the unshadowed road test set and 73.2% on the shadowed road test set, adapting well to shadow interference. This work realizes pixel-level prediction of field road scenes in hilly regions and lays a foundation for machine-vision-based autonomous navigation and obstacle avoidance of intelligent agricultural machinery on field roads.
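The dilated (atrous) convolution at the heart of this model widens the receptive field without pooling by spacing kernel taps `dilation` pixels apart. A minimal single-channel numpy sketch of the operation (illustrative only; the paper's models were built in Caffe):

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=1):
    """'Valid' 2-D dilated cross-correlation of a single-channel image x
    with a square kernel whose taps are spaced `dilation` pixels apart."""
    k = kernel.shape[0]
    span = dilation * (k - 1) + 1              # effective kernel extent
    h = x.shape[0] - span + 1
    w = x.shape[1] - span + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = x[i:i + span:dilation, j:j + span:dilation]
            out[i, j] = float(np.sum(patch * kernel))
    return out

img = np.arange(49, dtype=float).reshape(7, 7)
ones = np.ones((3, 3))
y1 = dilated_conv2d(img, ones, dilation=1)     # extent 3 -> 5x5 output
y2 = dilated_conv2d(img, ones, dilation=2)     # extent 5 -> 3x3 output
print(y1.shape, y2.shape)                      # (5, 5) (3, 3)
```

With dilation 2 the same nine-weight 3×3 kernel covers a 5×5 neighborhood, which is how dilated layers can preserve VGG-16's receptive field after pooling layers are removed.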
Accurate acquisition of the drivable area and obstacle information on field roads is an important problem for machine-vision-based automatic navigation of intelligent agricultural machinery. To accurately identify field roads and their surrounding environment, an image semantic segmentation model for field road scenes was proposed based on a dilated convolutional neural network (DCN). Field roads in hilly regions are often twisting, winding and rolling, occluded by crops along both sides and obstructed by many kinds of obstacles. Based on an analysis of the image features of field roads in hilly regions, field road scenes were divided into 11 categories in this paper: background, road, pedestrian, vegetation, sky, construction, livestock, obstacle, pond, soil and pole. Starting from a traditional fully convolutional network (FCN) with a VGG-16 structure, the front-end module and context aggregation module of the DCN were obtained by removing the parts that were not conducive to dense pixel prediction and restructuring a front-end module with higher prediction accuracy. In the front-end module, the pooling-4 and pooling-5 layers of VGG-16 were removed, the three convolutions in Conv-5 were replaced by dilated convolutions with a dilation rate of 2, and the Fc6 convolution layer was changed to a dilated convolution with a dilation rate of 4, so that the receptive field remained unchanged; the padding operations in VGG-16 were also removed. The context module was a cascade of dilated convolution layers with different dilation rates, the first six layers having rates of 1, 1, 2, 4, 8 and 16, respectively. Two context-module structures, named Basic and Large, were proposed. The parameters of the constructed DCN could be initialized from the traditional VGG-16 network, and the model produced higher-resolution output.
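The receptive-field arithmetic behind this cascade can be checked in a few lines: each stride-1 3×3 layer with dilation rate d widens the receptive field by 2d, so the rates 1, 1, 2, 4, 8, 16 grow it exponentially while the spatial resolution is preserved. A small bookkeeping sketch (our own illustration, assuming stride-1 3×3 layers throughout):

```python
def receptive_fields(dilations, kernel=3):
    """Receptive-field size after each stride-1 dilated conv layer."""
    rf, sizes = 1, []
    for d in dilations:
        rf += (kernel - 1) * d     # each layer adds (k - 1) * d pixels
        sizes.append(rf)
    return sizes

# First six context-module layers, dilation rates 1, 1, 2, 4, 8, 16:
print(receptive_fields([1, 1, 2, 4, 8, 16]))   # [3, 5, 9, 17, 33, 65]
```

Six layers thus see a 65×65 context window, far larger than the 13×13 window a plain stack of six 3×3 convolutions would reach.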
A two-stage training method was then adopted to address the long training time and difficult convergence. In the Caffe (convolutional architecture for fast feature embedding) deep learning framework, the improved network models were built and compared with the classical FCN-8s network model. Four models were tested: the FCN-8s model, the model with only the front-end module, and the models combining the front-end module with the Basic and Large context modules, respectively. The Front-end+Large model scored highest on all three evaluation indices, PA (pixel accuracy), MPA (mean per-class accuracy) and MIoU (mean intersection over union), while the FCN-8s model scored lowest; the Front-end+Large model was therefore chosen as the semantic segmentation model for field road scenes. Its MIoU was 73.4% on the unshadowed road test set and decreased by only 0.2 percentage points on the shadowed road test set, while its PA and MPA were nearly identical across the two test sets. The results show that the improved model adapts well to the shadow disturbance of field road scenes in hilly regions. The proposed model has good generalization and robustness, realizes pixel-level prediction of field road images in hilly regions, and provides basic support for the autonomous navigation and obstacle avoidance of agricultural machinery on field roads.
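All three evaluation indices above (PA, MPA, MIoU) are derived from the class confusion matrix. A sketch of their standard definitions (the two-class numbers are made up for illustration; the models here use 11 classes):

```python
import numpy as np

def segmentation_metrics(conf):
    """conf[i, j] = number of pixels of ground-truth class i predicted as j."""
    tp = np.diag(conf).astype(float)
    pa = tp.sum() / conf.sum()                        # pixel accuracy
    mpa = (tp / conf.sum(axis=1)).mean()              # mean per-class accuracy
    iou = tp / (conf.sum(axis=1) + conf.sum(axis=0) - tp)
    return pa, mpa, iou.mean()                        # PA, MPA, MIoU

# Toy 2-class confusion matrix:
conf = np.array([[50.0, 2.0],
                 [3.0, 45.0]])
pa, mpa, miou = segmentation_metrics(conf)
print(round(pa, 4), round(mpa, 4), round(miou, 4))
```

MIoU is the strictest of the three, since each class's intersection is divided by the union of its ground-truth and predicted pixels, which is why it is the headline figure (73.4%) reported above.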
