Cluster Research on Spatial Data (空间数据聚类的研究)

English title: Cluster Research on Spatial Data
Author: Sun Zhiwei (孙志伟)
Degree level: Doctoral
Discipline: Computer Application Technology
Keywords: Data Mining; Clustering; Constraints; Self-Organizing Map; Grid-based algorithm; Density-based algorithm
Degree year: 2007
Advisor: Zhao Zheng (赵政)
Discipline code: 081203
Degree-granting institution: Tianjin University
Thesis submission date: 2006-12-01

Abstract

Building effective, scalable clustering algorithms for large-scale, high-dimensional data under various constraints is an active research topic in data mining. This thesis studies clustering algorithms with these problems in mind. Its main contributions are as follows:
     Based on an analysis of density-based and grid-based algorithms, three algorithms are proposed: CluGD, GDRS and VCluGD. CluGD first uses a grid method to obtain representative points that reflect the data space, and then clusters the representative points with a density-based algorithm. A representative point is not an actual data point but a virtual point that summarizes the characteristics of the data points. CluGD takes the same parameters as DBSCAN, and the grid step greatly improves its efficiency. GDRS applies random sampling to the reference points. Because density varies widely in large-scale data, a single density setting cannot accurately characterize the data space, so VCluGD extends CluGD: a preprocessing step produces, for a given neighborhood radius, a plot of point density against the number of points, which helps users set multi-level parameters and perform multi-level clustering with better results. All three algorithms run in time linear in the size of the data set, which makes them suitable for clustering large-scale data.
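     The grid-then-density idea in the paragraph above can be illustrated with a minimal Python sketch. It is not the thesis's CluGD: the abstract does not say how representative points are built or weighted, so the sketch assumes 2-D points, square cells, the cell centroid as the virtual representative, and the cell's point count as its weight. The names `cell_of`, `grid_representatives`, `cluster_representatives` and `grid_density_cluster` are hypothetical.

```python
from collections import defaultdict, deque
from math import dist, floor


def cell_of(p, cell):
    """Grid cell index of a 2-D point for square cells of side `cell`."""
    return (floor(p[0] / cell), floor(p[1] / cell))


def grid_representatives(points, cell):
    """Return {cell_key: (centroid, count)}; the centroid is a virtual point."""
    buckets = defaultdict(list)
    for p in points:
        buckets[cell_of(p, cell)].append(p)
    reps = {}
    for key, members in buckets.items():
        n = len(members)
        cx = sum(m[0] for m in members) / n
        cy = sum(m[1] for m in members) / n
        reps[key] = ((cx, cy), n)
    return reps


def cluster_representatives(reps, eps, min_pts):
    """DBSCAN-style pass over weighted representatives; -1 marks noise."""
    def neighbours(k):
        # Brute force for brevity; a real implementation would only
        # inspect nearby cells to keep the running time linear.
        return [j for j in reps if dist(reps[k][0], reps[j][0]) <= eps]

    def is_core(nbrs):
        # Assumption: a representative counts once per point it summarizes.
        return sum(reps[j][1] for j in nbrs) >= min_pts

    labels, next_label = {}, 0
    for k in reps:
        if k in labels:
            continue
        nbrs = neighbours(k)
        if not is_core(nbrs):
            labels[k] = -1
            continue
        labels[k] = next_label
        queue = deque(nbrs)
        while queue:
            j = queue.popleft()
            if labels.get(j, -1) != -1:
                continue  # already assigned to a cluster
            labels[j] = next_label
            j_nbrs = neighbours(j)
            if is_core(j_nbrs):
                queue.extend(j_nbrs)
        next_label += 1
    return labels


def grid_density_cluster(points, cell, eps, min_pts):
    """Cluster the representatives, then give each point its cell's label."""
    reps = grid_representatives(points, cell)
    labels = cluster_representatives(reps, eps, min_pts)
    return [labels[cell_of(p, cell)] for p in points]
```

     Because the density pass only sees one representative per occupied cell, its cost depends on the number of occupied cells, not on the number of raw points. That is the kind of saving the abstract attributes to the grid step.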
     After studying the strengths and weaknesses of existing algorithms for clustering under non-spatial constraints, the thesis extends DBSCAN into the DBSCAN+ algorithm and then proposes using the SOM neural network to help handle high-dimensional non-spatial attributes. DBSCAN+ computes the dissimilarity of non-spatial attributes separately for each data type, and experimental results are given. The auxiliary method first uses SOM to select the dimensions of the high-dimensional data to cluster on. It then either clusters the candidate dimensions with DBSCAN+, or performs non-spatial clustering on the candidate dimensions with SOM and combines the SOM and DBSCAN+ clustering results. Experiments show that the method is effective.
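     Computing dissimilarity separately per data type can be sketched as follows. The abstract does not give the formula DBSCAN+ uses, so this sketch assumes a Gower-style average: range-normalized absolute difference for numeric attributes and a 0/1 mismatch for categorical ones. The function names, the `"xy"`/`"attrs"` record layout, and the extra threshold `delta` are all hypothetical.

```python
from math import dist


def attribute_dissimilarity(a, b, kinds, ranges):
    """Average per-attribute dissimilarity in [0, 1] (Gower-style assumption).

    kinds[i] is "numeric" or "categorical"; ranges[i] is the value range of a
    numeric attribute (ignored for categorical attributes).
    """
    total = 0.0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if kind == "numeric":
            total += abs(x - y) / rng if rng else 0.0
        elif kind == "categorical":
            total += 0.0 if x == y else 1.0
        else:
            raise ValueError(f"unknown attribute kind: {kind!r}")
    return total / len(kinds)


def is_neighbour(p, q, eps, delta, kinds, ranges):
    """Neighbourhood test that combines spatial and non-spatial closeness.

    Assumption: q is a neighbour of p only if it lies within the spatial
    radius eps AND its non-spatial dissimilarity to p is at most delta.
    """
    close_in_space = dist(p["xy"], q["xy"]) <= eps
    similar = attribute_dissimilarity(p["attrs"], q["attrs"], kinds, ranges) <= delta
    return close_in_space and similar
```

     A predicate like `is_neighbour` could stand in for the plain distance test in a DBSCAN-style region query, so that two spatially adjacent objects with very different non-spatial attributes do not end up in the same cluster.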
     To address the shortcomings of existing algorithms for clustering under spatial constraints, the thesis proposes DBOF, an algorithm that handles spatial constraints. It divides constraint objects into three kinds: obstacles, facilities (objects that make locations mutually reachable), and objects that act as both obstacle and facility. Obstacles are modeled as polygons, facilities are modeled with a graph topology, and objects of the third kind are modeled with a graph topology that carries crossing-point attributes. For obstacles, the distance between two points is measured by the complete obstacle distance, while the graph-topology modeling of the other two kinds of constraint makes the algorithm easier to apply in practice. Experiments show that DBOF produces better clustering results and runs efficiently.
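     The obstacle distance between two points, meaning the length of the shortest path that does not pass through any obstacle polygon, can be sketched as below. This is a generic visibility-graph computation with Dijkstra's algorithm, not the thesis's DBOF implementation. It assumes convex obstacle polygons, points outside every obstacle, and no path segment that runs exactly through two polygon vertices. It does not model facilities. All function names are hypothetical.

```python
from heapq import heappop, heappush
from math import dist, inf


def _orient(a, b, c):
    """Twice the signed area of triangle abc (> 0 if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _properly_cross(p, q, a, b):
    """True if segments pq and ab cross at a point interior to both."""
    return (_orient(p, q, a) * _orient(p, q, b) < 0
            and _orient(a, b, p) * _orient(a, b, q) < 0)


def _strictly_inside(pt, poly):
    """Strict interior test; valid for convex polygons only."""
    n = len(poly)
    signs = [_orient(poly[i], poly[(i + 1) % n], pt) for i in range(n)]
    return all(s > 1e-12 for s in signs) or all(s < -1e-12 for s in signs)


def visible(p, q, obstacles):
    """True if the straight segment pq is not blocked by any obstacle."""
    mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
    for poly in obstacles:
        n = len(poly)
        if any(_properly_cross(p, q, poly[i], poly[(i + 1) % n]) for i in range(n)):
            return False
        if _strictly_inside(mid, poly):  # catches diagonals between vertices
            return False
    return True


def obstacle_distance(p, q, obstacles):
    """Shortest unobstructed path length from p to q; inf if unreachable."""
    nodes = [p, q] + [v for poly in obstacles for v in poly]
    best = {0: 0.0}
    heap = [(0.0, 0)]
    done = set()
    while heap:
        d, i = heappop(heap)
        if i in done:
            continue
        if i == 1:  # reached q
            return d
        done.add(i)
        for j in range(len(nodes)):
            if j in done or not visible(nodes[i], nodes[j], obstacles):
                continue
            nd = d + dist(nodes[i], nodes[j])
            if nd < best.get(j, inf):
                best[j] = nd
                heappush(heap, (nd, j))
    return inf
```

     With no obstacle between two points the function returns their Euclidean distance. When a polygon blocks the straight line, the path bends around the polygon's vertices, so two points that are close in Euclidean terms may be far apart for clustering purposes.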

