Manifold-Based Dimensionality Reduction Methods and Their Applications in Computer Vision
Abstract
Much of the high-dimensional data in real-world applications can be modelled as data points lying close to a low-dimensional linear or nonlinear manifold. The underlying variations in image data sets, for example, correspond to continuous physical changes such as object pose, illumination, or human facial expressions. Discovering the manifold structure from a set of noisy data points sampled from it is a very challenging problem in unsupervised learning. In recent years, manifold-related methods for human-computer interaction have become an increasingly active research area.
     Human visual perception is also closely related to real-world manifolds. Artificial neural networks are a powerful tool for modelling and interpreting the manifold structures in real-world data. By manipulating the low-dimensional free parameters of a learned manifold, one can also synthesize or estimate the expected real-world data. This "bi-directional" process of learning and synthesis closely resembles typical human cognitive behavior, in which one learns from unorganized observations and then infers the unknown using the learned knowledge.
     Traditional dimensionality reduction techniques such as Principal Component Analysis (PCA) are restricted by their linear nature. Other methods, including Self-Organizing Maps (SOM), manifold learning, and kernel methods, have been developed to handle the nonlinearity of low-dimensional manifolds, but each has limitations of its own. Our approach draws inspiration from, and improves upon, these pioneering works. In order of increasing complexity of the underlying low-dimensional structures, the main contributions of this thesis are as follows:
     1. A new mean-shifting incremental PCA method, based on the autocorrelation matrix, is proposed. The method employs two transformations on the representation of the input data, so that the updated eigen-subspace is re-centered without recomputing the autocorrelation matrix of the old data. Moreover, the storage required for the old information and the dimension of the autocorrelation matrix remain constant rather than growing with the total number of input samples, and the old data need not be stored after each update. Compared with existing algorithms, the proposed method is a good choice for visual subspace learning and recognition tasks that demand low computation time (a toy sketch of the re-centering idea appears after this list).
     2. A new computationally efficient local PCA algorithm is proposed that combines the advantages of NGAS-PCA and PCA-SOM. Each local unit is associated with a mean vector and a covariance matrix, and a new competition measure implicitly combines the reconstruction error with the distance from the input datum to the unit center. The extra step of updating the principal subspaces is thereby eliminated from the data-distribution learning process. The model is suited to nonlinear pattern learning and recall: after training, the data distribution is represented by a collection of local linear units, and no prior information about the optimal principal subspace is needed for this pattern representation (see the competition sketch after this list).
     3. A new deformable model, the generalized Topology Preserving SOM (gTP-SOM), is proposed to bring the topology-preserving self-organizing mapping mechanism into the neuron competition of deformable models. It is inspired by the Visualization-Induced Self-Organizing Map (ViSOM), in which the mapping onto the neuron map preserves both the inter-point distances of the input data and the topology of the whole input space. The gTP-SOM is driven in parallel by an adaptive force field that imposes constraints on local boundary variation, and the model is implemented with Region-Aided Active Contours and Level Sets. It is suitable both for precise edge detection and for recovering complex shapes with strong fluctuations in boundary strength (a minimal level-set step is sketched after this list).
     4. A new manifold-based method is proposed to construct a nonlinear mapping between the input space and the feature space, rather than treating manifold learning and synthesis in isolation. The nonlinear mapping is realized by modelling local generative units in the input space and a global affine transformation in the feature space. This formulation yields simple analytic solutions for traversing between the input space and the feature space with out-of-sample data points. The proposed method avoids the alternating least-squares problem and local minima in both the manifold-learning process and the bidirectional out-of-sample extension. Moreover, it can estimate the underlying manifold dimension and is robust to the choice of the number of nearest neighbors (a toy bidirectional mapping is sketched after this list).
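
As a minimal illustration of the re-centering idea in contribution 1, the Python sketch below keeps only a running mean and a fixed-size running autocorrelation matrix, and recovers the centered covariance from them, so old samples never need to be revisited. This is a simplified stand-in, not the thesis algorithm: it retains the full d-by-d autocorrelation rather than a reduced eigen-subspace, and all names are hypothetical.

    import numpy as np

    class MeanShiftingIPCA:
        """Toy incremental PCA with mean re-centering via the autocorrelation."""

        def __init__(self, dim, k):
            self.n = 0                         # samples seen so far
            self.mean = np.zeros(dim)          # running mean of the inputs
            self.acorr = np.zeros((dim, dim))  # running autocorrelation E[x x^T]
            self.k = k                         # number of components to keep

        def update(self, x):
            self.n += 1
            self.mean += (x - self.mean) / self.n
            self.acorr += (np.outer(x, x) - self.acorr) / self.n
            # Re-centering: cov = E[x x^T] - mean mean^T, so the eigen-subspace
            # is refreshed without recomputing anything over the old data.
            cov = self.acorr - np.outer(self.mean, self.mean)
            evals, evecs = np.linalg.eigh(cov)
            self.components = evecs[:, np.argsort(evals)[::-1][:self.k]]

Note that the storage cost is O(d^2) regardless of how many samples have been processed, matching the constant-storage property claimed above.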
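Contribution 2 rests on a competition measure that implicitly blends reconstruction error with the distance to a unit's center. The exact measure is defined in the thesis; as a hedged stand-in with the same flavor, the Mahalanobis-style score below uses each unit's full covariance, which penalizes both off-subspace deviation and distance from the mean without maintaining explicit principal subspaces.

    import numpy as np

    def winning_unit(x, units, eps=1e-6):
        """Pick the local unit whose (mean, covariance) pair best explains x.

        units: list of (mean, covariance) pairs learned online. The full
        covariance distance implicitly trades off reconstruction error against
        distance to the unit center, so no separate principal-subspace update
        is needed during learning.
        """
        scores = []
        for mean, cov in units:
            d = x - mean
            icov = np.linalg.inv(cov + eps * np.eye(len(mean)))  # regularized
            scores.append(float(d @ icov @ d))
        return int(np.argmin(scores))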
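Contribution 3 is implemented with region-aided active contours and level sets. As a reminder of the basic machinery only, the fragment below takes one explicit step of the level-set equation phi_t + F |grad phi| = 0, with F standing in for the adaptive, region-aided force field; a practical implementation would add upwind differencing, curvature regularization, and periodic reinitialization.

    import numpy as np

    def level_set_step(phi, F, dt=0.1):
        """One explicit Euler step of  phi_t + F * |grad phi| = 0.

        phi: 2-D level-set function whose zero level set is the contour.
        F:   speed field; here a stand-in for the adaptive force field.
        """
        gy, gx = np.gradient(phi)              # central differences
        grad_mag = np.sqrt(gx ** 2 + gy ** 2)  # |grad phi|
        return phi - dt * F * grad_mag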
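Contribution 4 claims closed-form transitions between the input and feature spaces for out-of-sample points. The toy functions below show the general shape of such a bidirectional map under a deliberately simplified, hypothetical representation (one affine map per local generative unit); in the thesis the local units are instead tied together by a global affine transformation in the feature space.

    import numpy as np

    def to_feature(x, units):
        """Map an out-of-sample input x into the feature space.

        units: list of (mu_x, mu_y, A) triples, one per local generative
        unit, where A maps centered input coordinates to feature coordinates.
        """
        i = int(np.argmin([np.linalg.norm(x - mu_x) for mu_x, _, _ in units]))
        mu_x, mu_y, A = units[i]
        return mu_y + A @ (x - mu_x)

    def to_input(y, units):
        """Map a feature-space point y back to the input space."""
        i = int(np.argmin([np.linalg.norm(y - mu_y) for _, mu_y, _ in units]))
        mu_x, mu_y, A = units[i]
        return mu_x + np.linalg.pinv(A) @ (y - mu_y)  # least-squares inverse

Both directions are simple analytic expressions, which is the point of the closed-form out-of-sample extension described above.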