Research and Application of Multiple Kernel Prediction Models Based on Statistical Learning Theory
Abstract
In recent years, using learning theory to solve data-analysis problems has become one of the research trends in statistics. As problems keep growing in scale and complexity, more efficient learning methods are needed. Within the framework of statistical learning theory, this doctoral dissertation applies kernel methods to propose several new learning schemes, builds a family of multiple kernel support vector regression machines, and applies them successfully to practical forecasting. Compared with traditional learning methods, the new schemes are more efficient and achieve the desired prediction performance at a low computational cost.
     A main problem with traditional kernel learning methods is which optimization algorithm to adopt once the learning model has been established. The first contribution of this dissertation is therefore a single-direction-convergent sequential minimal optimization algorithm (SD-SMO) for working-set selection when solving the least squares support vector machine (LSSVM). The algorithm optimizes only one Lagrange multiplier per iteration, driving the gradient of the objective function with respect to that multiplier monotonically to zero. Experiments on benchmark data sets show that SD-SMO causes almost no loss of learning accuracy while effectively reducing the number of iterations and the computational cost.
     To avoid the difficulty that traditional kernel learning must choose a specific kernel, researchers proposed multiple kernel learning (MKL) for multi-source or heterogeneous data. The kernel in MKL is usually a combination drawn from a family of kernel functions, and the prediction model is built on the learned combined kernel. Traditional MKL is based on the l1 norm, but the resulting sparse solution degrades prediction accuracy. The second contribution of this dissertation generalizes l1-norm MKL to an lp-norm (p > 1) multiple kernel support vector regression model, which effectively overcomes the accuracy loss caused by sparse solutions and improves prediction performance. To solve the proposed model, an interleaved, alternating optimization algorithm is given. Experiments on real economic data show that the method clearly outperforms both single-kernel support vector regression and the l1-norm multiple kernel support vector regression model.
     For data with complex underlying mechanisms and strong time-varying behavior, online multiple kernel learning has recently won researchers' favor and become another hot topic in machine learning. The final contribution of this dissertation is therefore an online multiple kernel learning framework for prediction, together with a set of algorithms and theoretical analysis. The optimization algorithm fuses two online learning algorithms; to address the gradually increasing computational cost of online learning, a weighted random sampling strategy is applied to reduce that cost. Empirical analysis on benchmark time series data sets shows that the online multiple kernel support vector regression model achieves good prediction performance at a large computational cost, and that after the weighted random sampling strategy is adopted, high prediction accuracy is maintained while learning time drops markedly.
     Solving data-analysis problems with learning theory will bring new vitality to statistics, and this dissertation makes a useful attempt in that direction. The results are not limited to forecasting: the construction ideas and the related theory and techniques can be extended to other learning settings. The work enriches the theory and methods of data analysis and offers some guidance for statistical practice.
With the development of the economy, science, and technology, the tasks of statistics become more complex and the data sets involved keep growing in size. To cope with these challenges, this dissertation develops several fast kernel machines, which are successfully applied to time series forecasting. Compared with existing methods, the resulting machines obtain better performance. For each machine, we give a clear theoretical analysis and carry out a series of experiments to evaluate its performance.
     A main problem for traditional kernel learning methods is which optimization algorithm to adopt after a learning model is established. To solve the LSSVM, a new technique for working-set selection in sequential minimal optimization (SMO)-type decomposition methods is therefore proposed. With the new method, a single direction is selected along which the optimality condition converges. A simple asymptotic convergence proof for the new algorithm is given. Experimental comparisons demonstrate that the classification accuracy of the new method differs little from that of existing methods, while its training speed is faster.
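The single-multiplier mechanism can be sketched as Gauss-Southwell coordinate descent on the LSSVM dual: at each step, update the one Lagrange multiplier whose gradient is largest in magnitude so that this gradient component becomes exactly zero. The Python below is a minimal sketch of that idea only, not the dissertation's exact SD-SMO; the function names, the RBF kernel choice, and the omission of the bias term are my simplifications.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian RBF kernel matrix between the rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def sd_smo_lssvm(K, y, C=10.0, tol=1e-6, max_iter=10_000):
    """Coordinate-descent sketch of a single-multiplier SMO-type LSSVM solver.

    Minimizes 0.5 * a^T H a - y^T a with H = K + I/C (the bias term is
    dropped for simplicity).  Each iteration picks the Lagrange multiplier
    whose gradient is largest in magnitude and updates it so that this
    gradient component becomes exactly zero.
    """
    n = len(y)
    H = K + np.eye(n) / C
    a = np.zeros(n)
    g = -y.astype(float)             # gradient H a - y at a = 0
    for _ in range(max_iter):
        i = int(np.argmax(np.abs(g)))
        if abs(g[i]) < tol:          # stop once the whole gradient is ~0
            break
        delta = -g[i] / H[i, i]      # zeroes the i-th gradient component
        a[i] += delta
        g += delta * H[:, i]         # rank-one gradient update, O(n) per step
    return a

# Toy usage: fit noisy sin(x); predictions at the training points are K @ a.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(60)
a = sd_smo_lssvm(rbf_kernel(X, X), y)
```

Because each update touches a single column of H, one iteration costs O(n), which is where this kind of scheme saves work relative to repeatedly solving the full linear system.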
     To overcome the difficulty that traditional kernel learning methods must choose a specific kernel function, researchers put forward multiple kernel learning (MKL) for multi-source or heterogeneous data. The kernel used in MKL is a combination of a family of kernel functions. Traditional MKL is based on the l1 norm, but the sparse solution reduces the prediction accuracy of the model: l1-norm multiple kernel support vector regression is rarely observed to outperform trivial baselines in practical applications. To allow robust kernel mixtures that generalize well, we adopt lp-norm multiple kernel support vector regression (1 ≤ p < ∞) as a stock price prediction model. The optimization problem is decomposed into smaller sub-problems, and an interleaved optimization strategy is employed to solve the regression model. Experimental results show that the proposed model performs better than the l1-norm multiple kernel support vector regression model.
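The interleaved strategy can be illustrated with the standard lp-norm MKL alternation of Kloft et al. (2011): fix the kernel weights d and solve the learner, then update d in closed form from the per-kernel norms and renormalize onto the lp ball. The sketch below is mine and substitutes kernel ridge regression for the dissertation's support vector regression step so that the inner solve stays in closed form; lp_mkl_ridge, lam, and n_outer are hypothetical names.

```python
import numpy as np

def lp_mkl_ridge(kernels, y, p=2.0, lam=1e-2, n_outer=20):
    """Interleaved-optimization sketch of lp-norm multiple kernel regression.

    Alternates between (1) solving the regression with the kernel weights d
    fixed and (2) the closed-form lp-norm weight update
        d_m  proportional to  ||w_m||^(2/(p+1)),  renormalized so ||d||_p = 1,
    where ||w_m||^2 = d_m^2 * a^T K_m a.  Kernel ridge regression stands in
    here for the support vector regression step.
    """
    M, n = len(kernels), len(y)
    d = np.full(M, M ** (-1.0 / p))          # feasible start: ||d||_p = 1
    for _ in range(n_outer):
        K = sum(dm * Km for dm, Km in zip(d, kernels))
        a = np.linalg.solve(K + lam * np.eye(n), y)   # inner regression solve
        w = np.array([dm * np.sqrt(max(a @ Km @ a, 0.0))
                      for dm, Km in zip(d, kernels)])  # per-kernel ||w_m||
        d = w ** (2.0 / (p + 1))             # closed-form weight update
        d /= np.linalg.norm(d, ord=p)        # project back: ||d||_p = 1
    return a, d
```

For p close to 1 the weight vector d becomes nearly sparse, while larger p spreads weight across all kernels; that non-sparsity is exactly what the abstract credits for the accuracy gain over l1-norm MKL.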
     For complex, time-varying data, online MKL has recently won the favor of researchers and has become another hot research topic in machine learning. Finally, an online multiple kernel prediction framework is therefore proposed, and the corresponding algorithms and theoretical analysis are given. The optimization algorithm is a fusion of two online learning algorithms. Since the computational cost grows gradually during online learning, a weighted random sampling strategy is adopted; it reduces the computational cost while keeping forecast accuracy. Test results on standard time series data sets show that the online multiple kernel support vector regression model obtains a good forecasting effect at a large computational cost; after the sampling strategy is adopted, learning accuracy hardly falls at all while learning time is significantly reduced.
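Weighted random sampling over a stream is commonly done with the reservoir algorithm of Efraimidis and Spirakis (2006): give each arriving item a key u^(1/w) with u uniform on (0, 1) and keep the k items with the largest keys. The sketch below shows that budget-keeping step in isolation; treating stored support vectors as the items and |alpha_t| as the weight w is my assumption about how it would plug into an online kernel learner, not the dissertation's exact scheme.

```python
import heapq
import random

def weighted_reservoir(stream, k):
    """One-pass weighted reservoir sampling (Efraimidis-Spirakis A-Res).

    Each (item, weight) pair in the stream gets a random key u**(1/weight)
    with u ~ U(0,1); only the k items with the largest keys are kept, at
    O(log k) cost per arrival.  In the online-learning reading, `item`
    would be a stored support vector and `weight` its importance, e.g.
    |alpha_t| (an assumption for illustration).
    """
    heap = []                                   # min-heap of (key, seq, item)
    for seq, (item, w) in enumerate(stream):
        key = random.random() ** (1.0 / w)
        if len(heap) < k:
            heapq.heappush(heap, (key, seq, item))
        elif key > heap[0][0]:                  # beats the current worst key
            heapq.heapreplace(heap, (key, seq, item))
    return [item for _, _, item in heap]

# Toy usage: sample 5 of 100 indices; heavier-weighted items appear more often.
sample = weighted_reservoir(((i, 1.0 + i) for i in range(100)), k=5)
```

Capping the retained set at k is what keeps the per-step cost of the online learner bounded instead of growing with the length of the stream.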
     Solving data-analysis problems with learning theory will bring new vitality to statistics, and this dissertation has made a beneficial attempt in this direction. The application of these achievements is not confined to forecasting; the ideas and the related theory and techniques can also be extended to other learning areas. Learning from data prompts the development of statistics, and in turn, the development of statistics provides more theoretical support for learning. The theory and methods of data analysis developed here thus offer some guidance for statistical practice.