A Reinforcement-Learning-Based Method for the Automated Design of Cost-Constrained Convolutional Neural Network Architectures (in English)
  • English Title: Automatically Design Cost-Constrained Convolutional Neural Network Architectures with Reinforcement Learning
  • Authors: 许强; 徐杨杰; 姜玉林; 张涌
  • English Authors: XU Qiang; XU Yangjie; JIANG Yulin; ZHANG Yong; Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • Keywords: deep learning; reinforcement learning; convolutional neural network; neural architecture search; cost optimization
  • English Keywords: deep learning; reinforcement learning; convolutional neural network; neural architecture search; cost optimization
  • Chinese Journal Title: 集成技术
  • English Journal Title: Journal of Integration Technology
  • Institutions: Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • Publication Date: 2019-04-02 18:20
  • Publisher: Journal of Integration Technology (集成技术)
  • Year: 2019
  • Issue: 03
  • Funding: National Natural Science Foundation of China Key Program (61433012); Key R&D Program of the Ministry of Science and Technology (2018YFB0204005)
  • Language: English
  • Pages: 44-56
  • Page Count: 13
  • CN: 44-1691/T
  • ISSN: 2095-3135
  • CLC Number: TP183; R318
Abstract
Current methods for the automated design of neural network architectures mainly optimize the prediction accuracy of the designed network. In practice, however, the designed architecture is often required to satisfy specific cost constraints, such as memory consumption, inference time, and training time. This paper proposes a new method for the automated design of neural network architectures under cost constraints. Taking memory consumption, inference time, and training time as three representative cost types, experiments were conducted on the CIFAR-10 dataset and compared against existing methods. The proposed method obtains high-accuracy convolutional neural network architectures that satisfy the given cost constraints, and it can optimize more types of cost than existing methods.
        Recently, automated neural network architecture design (neural architecture search) has yielded many significant achievements. Improving the prediction accuracy of the neural network is the primary goal. However, besides prediction accuracy, other types of cost, including memory consumption, inference time, and training time, are also very important when deploying a neural network. In practice, such costs are often bounded by thresholds. Current neural architecture search methods with budgeted cost constraints can only optimize a few specific types of cost. In this paper, we propose budgeted efficient neural architecture search (B-ENAS) to optimize more types of cost. Experimental results on the widely adopted CIFAR-10 dataset show that B-ENAS can learn convolutional neural network architectures with high accuracy under different cost constraints.
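This record does not include the paper's implementation details, but the core idea it describes — a reinforcement-learning controller (REINFORCE, reference [27]) whose accuracy reward is penalized when a sampled architecture exceeds a cost budget — can be illustrated with a toy sketch. Everything here is a hypothetical stand-in: `CHOICES`, `BUDGET`, and `evaluate` are invented for illustration, and the real B-ENAS controller searches a far larger space than a single layer width.

```python
import math
import random

random.seed(0)

# Hypothetical toy search space: each "architecture" is just a layer width.
CHOICES = [16, 32, 64, 128]
BUDGET = 80  # hypothetical cost budget (e.g. memory units)

def evaluate(width):
    """Stand-in for training a child network: accuracy grows with width,
    and cost is taken to be the width itself (a toy proxy)."""
    accuracy = 1.0 - 1.0 / width
    cost = width
    return accuracy, cost

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample(probs):
    r, cum = random.random(), 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

# REINFORCE (Williams, 1992) over a categorical controller distribution,
# with the accuracy reward penalized whenever the sampled architecture
# exceeds the cost budget.
logits = [0.0] * len(CHOICES)
lr, baseline = 0.5, 0.0
for step in range(500):
    probs = softmax(logits)
    i = sample(probs)
    accuracy, cost = evaluate(CHOICES[i])
    reward = accuracy if cost <= BUDGET else accuracy - 1.0  # budget penalty
    advantage = reward - baseline
    baseline = 0.9 * baseline + 0.1 * reward  # moving-average baseline
    # Gradient of log pi(i) w.r.t. the logits is one_hot(i) - probs.
    for j in range(len(logits)):
        grad = (1.0 if j == i else 0.0) - probs[j]
        logits[j] += lr * advantage * grad

final_probs = softmax(logits)
best = CHOICES[max(range(len(CHOICES)), key=lambda j: final_probs[j])]
print("selected width:", best)
```

Because out-of-budget samples receive a heavily penalized reward, the controller's distribution shifts toward the most accurate architecture that still fits the budget; the paper's method applies this principle to realistic costs such as memory, inference time, and training time.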
References
[1] Zoph B, Le QV. Neural architecture search with reinforcement learning [C]// The International Conference on Learning Representations (ICLR), 2017.
[2] Zoph B, Vasudevan V, Shlens J, et al. Learning transferable architectures for scalable image recognition [C]// The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[3] Pham H, Guan MY, Zoph B, et al. Efficient neural architecture search via parameter sharing [C]// The International Conference on Machine Learning (ICML), 2018.
[4] Real E, Moore S, Selle A, et al. Large-scale evolution of image classifiers [C]// The International Conference on Machine Learning (ICML), 2017.
[5] Real E, Aggarwal A, Huang YP, et al. Regularized evolution for image classifier architecture search [C]// The Thirty-Third AAAI Conference on Artificial Intelligence, 2019.
[6] Suganuma M, Shirakawa S, Nagao T. A genetic programming approach to designing convolutional neural network architectures [C]// The Genetic and Evolutionary Computation Conference (GECCO), 2017: 497-504.
[7] Veniat T, Denoyer L. Learning time/memory-efficient deep architectures with budgeted super networks [C]// The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 3492-3500.
[8] Baker B, Gupta O, Naik N, et al. Designing neural network architectures using reinforcement learning [C]// The International Conference on Learning Representations (ICLR), 2017.
[9] Baker B, Gupta O, Raskar R, et al. Accelerating neural architecture search using performance prediction [C]// The Conference on Neural Information Processing Systems (NIPS), 2017.
[10] Brock A, Lim T, Ritchie JM, et al. SMASH: one-shot model architecture search through HyperNetworks [J]. arXiv preprint arXiv:1708.05344, 2017.
[11] Jin HF, Song QQ, Hu X, et al. Neural architecture search with network morphism [J]. arXiv:1806.10282v2, 2018.
[12] Cai H, Chen TY, Zhang WN, et al. Efficient architecture search by network transformation [C]// The Association for the Advancement of Artificial Intelligence (AAAI), 2018.
[13] Cai H, Yang JC, Zhang WN, et al. Path-level network transformation for efficient architecture search [C]// The International Conference on Machine Learning (ICML), 2018.
[14] Zhong Z, Yan JJ, Wu W, et al. Practical block-wise neural network architecture generation [C]// The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 2423-2432.
[15] Liu HX, Simonyan K, Vinyals O, et al. Progressive neural architecture search [C]// European Conference on Computer Vision (ECCV), 2018.
[16] Saxena S, Verbeek J. Convolutional neural fabrics [C]// The Conference on Neural Information Processing Systems (NIPS), 2016.
[17] Negrinho R, Gordon G. DeepArchitect: automatically designing and training deep architectures [J]. arXiv:1704.08792, 2017.
[18] Liu CX, Zoph B, Neumann M, et al. Progressive neural architecture search [C]// The European Conference on Computer Vision (ECCV), 2018: 19-34.
[19] Elsken T, Metzen JH, Hutter F. Simple and efficient architecture search for convolutional neural networks [C]// The International Conference on Learning Representations (ICLR), 2017.
[20] Liu HX, Simonyan K, Yang YM. DARTS: differentiable architecture search [J]. arXiv:1806.09055, 2018.
[21] Dong XY, Huang JS, Yang Y, et al. More is less: a more complicated network with less inference complexity [C]// The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 5840-5848.
[22] Huang G, Chen DL, Li TH, et al. Multi-scale dense convolutional networks for resource efficient image classification [C]// The International Conference on Learning Representations (ICLR), 2018.
[23] Hassibi B, Stork DG. Second order derivatives for network pruning: optimal brain surgeon [C]// The Conference on Neural Information Processing Systems (NIPS), 1993.
[24] Vanhoucke V, Senior A, Mao MZ. Improving the speed of neural networks on CPUs [C]// The Conference on Neural Information Processing Systems (NIPS), 2011.
[25] Han S, Mao HZ, Dally WJ. Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding [C]// The International Conference on Learning Representations (ICLR), 2015.
[26] Williams RJ, Peng J. Function optimization using connectionist reinforcement learning algorithms [J]. Connection Science, 1991, 3(3): 241-268.
[27] Williams RJ. Simple statistical gradient-following algorithms for connectionist reinforcement learning [J]. Machine Learning, 1992, 8(3-4): 229-256.
