Research on Algorithms for l_p Regularization Problems
Abstract
The l_p regularization problem has wide applications in variable selection, signal processing, compressive sensing, data mining, financial optimization, and many other fields, and its theory and algorithms are currently a focus of attention in the international optimization community. This thesis studies the theory and numerical algorithms of the l_p regularization problem and their applications in compressive sensing and related areas, concentrating on the case p ∈ (0,1]. When p = 1, the problem is a Lipschitz continuous nonsmooth optimization problem. When p ∈ (0,1), it is the minimization of a nonsmooth, nonconvex function that is not even Lipschitz continuous, which makes it a considerably harder problem.
     On the algorithmic side, the thesis concentrates on numerical methods for large-scale problems, beginning with the large-scale l_1 regularization problem, which we approach from two different directions. On the one hand, we adapt the spectral gradient method, which is popular for smooth large-scale optimization: using a generalized gradient of the objective function of the l_1 regularization problem, we propose a spectral gradient method for this problem and establish a convergence theorem for it. On the other hand, for the l_2-l_1 regularization problem we exploit the special structure of the problem to transform it into an equivalent system of nonsmooth equations; this system is monotone, Lipschitz continuous, and semismooth. On this basis we propose a low-storage iterative method for the l_2-l_1 regularization problem. An important advantage of the method is that the distance from the generated iterates to the solution set decreases monotonically and, moreover, the whole sequence converges to a solution of the problem. To test the practical performance of the proposed algorithms, we apply them to sparse signal reconstruction and image restoration and compare them with existing numerically successful solvers for the l_1 regularization problem such as GPSR and SpaRSA; the results show that our algorithms have better practical effectiveness and numerical performance.
     Regarding numerical algorithms for the l_p (p ∈ (0,1)) regularization problem, the thesis focuses on the l_{1/2} regularization problem, mainly because extensive numerical evidence indicates that the l_{1/2} problem is representative of the family of l_p (p ∈ (0,1)) regularization problems. We first apply the idea of the spectral gradient method to this problem and propose a quasi-spectral gradient method for the l_{1/2} regularization problem. It is worth noting that this quasi-spectral gradient method is not a simple extension of the spectral gradient method for smooth optimization but an improvement of it. Since the l_1 and l_{1/2} regularization problems are nondifferentiable and, in particular, the objective function of the l_{1/2} problem is not Lipschitz continuous so that its generalized gradient does not exist, we construct a function whose role in the problem is similar to that of the gradient in smooth optimization, and design the algorithm around it following the ideas of the spectral gradient method. Under appropriate conditions we prove the global convergence of the algorithm.
     The most important contribution of the thesis is the derivation of an equivalent smooth constrained optimization reformulation of the l_{1/2} regularization problem. The reformulated problem minimizes a smooth function over a feasible region composed of simple quadratic inequality constraints and nonnegativity constraints. The feasible region has nonempty interior, KKT points are guaranteed to exist, and the KKT points coincide with the stationary points of the original problem. This equivalent model opens a new route for designing numerical algorithms for the l_{1/2} regularization problem, making it possible to apply good numerical methods for smooth constrained optimization to it. Based on this model, we propose a feasible steepest descent method for the l_{1/2} regularization problem. The feasible steepest descent direction has an explicit expression, so the method enjoys low storage requirements and low computational cost. Under fairly weak conditions we establish a global convergence theorem for the algorithm, and our numerical results show that the proposed feasible steepest descent method performs very well.
     The thesis also studies optimality conditions for the l_{1/2} regularization problem, deriving first-order and second-order necessary conditions and a second-order sufficient condition for its solutions; these generalize existing results. In particular, we prove that a point satisfying the first-order condition cannot be a local maximizer of the objective function.
     This dissertation was typeset with LaTeX2ε.
The l_p regularization problem has wide applications in variable selection, signal processing, compressive sensing, data mining, financial optimization, and so on. Research on l_p regularization problems is a hot topic in the field of optimization. In this thesis, we study the theory of and numerical algorithms for solving l_p regularization problems, together with their applications in compressive sensing. We are particularly interested in the case p ∈ (0,1]. The case p = 1 corresponds to the l_1 regularization problem, a nonsmooth but Lipschitz continuous optimization problem. In the case p ∈ (0,1), the problem is nonconvex and even non-Lipschitzian; it has been proved that the l_p problem with p ∈ (0,1) is strongly NP-hard. The study of this problem is challenging.
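For reference, the problem discussed throughout can be stated in the standard form below. This is a sketch consistent with the l_2-l_1 and l_{1/2} special cases named in this abstract; the thesis itself may allow a more general smooth data-fidelity term:

    \min_{x \in \mathbb{R}^n} \ \frac{1}{2}\|Ax - b\|_2^2 + \lambda \|x\|_p^p,
    \qquad \|x\|_p^p = \sum_{i=1}^{n} |x_i|^p, \quad \lambda > 0, \quad p \in (0,1].

For p = 1 the regularizer is the convex l_1 norm; for p ∈ (0,1) the terms |x_i|^p are nonconvex and non-Lipschitz at the origin, which is the source of the analytical difficulty described above.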
     Our first task is to develop numerical methods for solving the l_p regularization problem with p = 1 and p = 1/2, focusing on methods for large-scale problems. We first study numerical methods for solving large-scale l_1 regularization problems, which we develop in two different ways. One is an iterative method that solves the problem directly: we improve the so-called spectral gradient method for smooth optimization and, by using the generalized gradient of the objective function, develop a quasi-spectral gradient method for the l_1 regularization problem and establish its global convergence. The other is a nonsmooth-equation-based method for the l_2-l_1 regularization problem. We first derive an equivalent nonsmooth equation reformulation of the problem; this equation enjoys some nice properties, being monotone, Lipschitz continuous, and semismooth. Based on the reformulation, we propose an iterative method for solving l_2-l_1 regularization problems. A notable advantage of the proposed method is that the distance between the iterates and the solution set is decreasing; moreover, the whole sequence converges to a solution of the problem. We apply the proposed methods to practical problems arising from compressive sensing and image restoration, and compare their performance with well-known methods such as GPSR and SpaRSA. The results show that the proposed methods perform better than the existing ones and are practically efficient.
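The abstract does not spell out the nonsmooth equation itself. A standard reformulation of the l_2-l_1 problem with exactly the advertised properties (monotone, Lipschitz continuous, semismooth) is the fixed-point residual built from soft-thresholding, sketched below in NumPy. The function names are ours, and the plain fixed-point loop is only a stand-in for the thesis's low-storage projection-type method, whose precise update rule is not given in the abstract.

    import numpy as np

    def soft_threshold(z, t):
        # Componentwise shrink(z, t) = sign(z) * max(|z| - t, 0).
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def residual(x, A, b, lam, tau=1.0):
        # F(x) = x - shrink(x - tau * A^T (A x - b), tau * lam).
        # For any tau > 0, F(x) = 0 exactly at the solutions of
        # min_x 0.5 * ||A x - b||^2 + lam * ||x||_1.
        grad = A.T @ (A @ x - b)
        return x - soft_threshold(x - tau * grad, tau * lam)

    def solve_l1(A, b, lam, tau, x0, max_iter=500, tol=1e-8):
        # Plain fixed-point iteration on F; converges for
        # 0 < tau < 2 / ||A^T A||_2 (forward-backward splitting).
        x = x0.copy()
        for _ in range(max_iter):
            F = residual(x, A, b, lam, tau)
            if np.linalg.norm(F) <= tol:
                break
            x = x - F  # i.e. x <- shrink(x - tau * grad, tau * lam)
        return x

With tau = 1/||A||_2^2 this reduces to the classical iterative shrinkage-thresholding scheme; the point of the equation view is that methods for monotone semismooth equations, like the one proposed in the thesis, can be applied instead.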
     We pay particular attention to numerical methods for solving l_{1/2} regularization problems. A large amount of numerical experiments has shown that the l_{1/2} regularization problem plays a representative role among the l_p regularization problems with p ∈ (0,1). We first extend the idea of the spectral gradient method to this problem and propose a quasi-spectral gradient method for solving l_{1/2} regularization problems. It should be pointed out that the quasi-spectral gradient method proposed in the thesis is not a simple generalization of the spectral gradient method for smooth optimization but an improvement of it. We notice that the l_{1/2} regularization problem is not differentiable and not even Lipschitz continuous; consequently, the generalized gradient in the sense of Clarke does not exist. To develop the quasi-spectral gradient method, we construct a function that takes the same role as the gradient in the case of smooth optimization. Based on this function, we develop a quasi-spectral gradient method and prove its global convergence.
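The constructed gradient-like function is not given in the abstract, so the skeleton below, a minimal sketch, leaves it as a user-supplied map g(x) and shows only the spectral (Barzilai-Borwein) stepping that such a method builds on; the safeguards and the globalizing line search of the actual algorithm are omitted.

    import numpy as np

    def quasi_spectral_gradient(g, x0, alpha0=1.0, max_iter=1000, tol=1e-8,
                                alpha_min=1e-10, alpha_max=1e10):
        # g(x) is a placeholder for the thesis's constructed function that
        # plays the role of the gradient for the non-Lipschitz l_{1/2} objective.
        x, gx, alpha = x0.copy(), g(x0), alpha0
        for _ in range(max_iter):
            if np.linalg.norm(gx) <= tol:
                break
            x_new = x - alpha * gx      # spectral-gradient step
            g_new = g(x_new)
            s, y = x_new - x, g_new - gx
            sy = s @ y
            # Barzilai-Borwein steplength, safeguarded to [alpha_min, alpha_max].
            alpha = (s @ s) / sy if sy > 0 else alpha_max
            alpha = min(max(alpha, alpha_min), alpha_max)
            x, gx = x_new, g_new
        return x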
     Another important contribution of the thesis is the derivation of an equivalent smooth constrained optimization reformulation of the l_{1/2} regularization problem. The reformulation minimizes a smooth function subject to simple quadratic inequality constraints and nonnegativity constraints. The feasible set has a nonempty interior and KKT points always exist; moreover, the KKT points are the same as the stationary points of the l_{1/2} regularization problem. The equivalence between the l_{1/2} regularization problem and the smooth constrained optimization problem opens a new path for developing numerical methods for l_{1/2} regularization problems: it makes it possible to apply numerical methods that are efficient for smooth constrained optimization. Based on this reformulation, we propose a feasible direction method for the l_{1/2} regularization problem. The feasible steepest descent direction in the method has an explicit expression, and the method is easy to implement. Under mild conditions, we establish the global convergence of the proposed method. Our numerical experiments show that the proposed feasible direction method performs very well in computation.
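The abstract describes the reformulation only qualitatively. One splitting that matches the description, given here as a sketch only since the thesis's exact formulation may differ, introduces an auxiliary variable t ∈ R^n:

    \min_{x,\,t} \ \frac{1}{2}\|Ax - b\|_2^2 + \lambda \sum_{i=1}^{n} t_i
    \quad \text{s.t.} \quad x_i \le t_i^2, \ \ -x_i \le t_i^2, \ \ t_i \ge 0, \ \ i = 1,\dots,n.

The objective is smooth, the constraints are simple quadratics plus nonnegativity, and the interior is nonempty (take t large). At any minimizer the constraints are tight, t_i = |x_i|^{1/2}, so \sum_i t_i = \|x\|_{1/2}^{1/2} and the original l_{1/2} objective is recovered.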
     We also study the optimality conditions for the l_{1/2} regularization problem. We derive first-order and second-order necessary conditions and give a second-order sufficient condition for a point to be a local solution of the problem. These conditions are extensions of the existing optimality conditions. In addition, we show that any point satisfying the first-order necessary condition cannot be a local maximizer of the l_{1/2} regularization problem.
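As a hedged illustration of what such conditions look like (the thesis's precise statements may be sharper), consider f(x) = \frac{1}{2}\|Ax - b\|_2^2 + \lambda\|x\|_{1/2}^{1/2}. Since the regularizer is smooth away from zero components, a first-order necessary condition at a local minimizer x^* reads, for every index i with x_i^* \ne 0,

    \big(A^{\top}(Ax^* - b)\big)_i + \frac{\lambda}{2}\,\mathrm{sign}(x_i^*)\,|x_i^*|^{-1/2} = 0,

i.e. the smooth stationarity equation restricted to the support of x^*. The zero components carry no such equation, which is why second-order conditions are needed to characterize solutions further.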
     This dissertation is typeset with LaTeX2ε.