Research on Feature Selection Methods in Fluorite Powder Liquefaction Modeling
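Abstract
Fluorite powder shipped by sea can liquefy when its moisture content in transit is too high, and liquefaction can sink the ship. As more of these accidents have occurred, the importance of understanding why fluorite powder liquefies has become clear. Because little research on fluorite powder liquefaction exists in China or abroad, this thesis starts from all factors that might affect liquefaction, uses feature selection to identify the main ones, and builds a corresponding liquefaction model. The results are intended as a reference for later mechanistic modeling and for fluorite powder producers. The main work is as follows:
     (1) To find the main factors that affect fluorite powder liquefaction, all factors that might affect it were considered first. Experiments were carried out according to the relevant experimental standards, and valid data covering 20 attributes of 196 samples were collected.
     (2) A method is proposed that removes abnormal samples one at a time based on regression prediction error. Compared with removing all flagged samples in a single pass, the algorithm gives each sample identified as abnormal a second confirmation. This largely avoids deleting normal samples by mistake while still improving the accuracy of the regression prediction model.
     (3) A feature selection method is proposed that integrates regression prediction error with a genetic algorithm. The algorithm encodes all attributes as one genetic individual and evaluates the individual's fitness from the prediction error of the regression model and the number of attributes the individual contains. Through repeated selection, crossover and mutation, the population converges to an optimal individual, and the attributes it contains form the optimal attribute set. Within the genetic algorithm, a fitness function that combines prediction error and attribute count is proposed, and ideas from simulated annealing are introduced into the selection and mutation operators. These changes improve both operators, aid the search for the optimum, and strengthen the algorithm's global search capability.
     (4) Applying the proposed feature selection method to the experimental data yields an optimal attribute set of 8 attributes. A sensitivity analysis of these 8 attributes on the regression prediction model shows how each of them affects fluorite powder liquefaction.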
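The one-at-a-time removal idea in item (2) can be illustrated with a minimal sketch. The abstract does not specify the regression model or the abnormality criterion, so everything here is an assumption for illustration: a one-variable least-squares line stands in for the regression model, and `tol` is a hypothetical absolute error threshold. Only the single worst sample is removed per round, and the model is refitted before any other sample is judged again.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (stand-in regression model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)  # assumes the xs are not all identical
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def progressive_outlier_removal(xs, ys, tol):
    """Remove at most one sample per round, refitting after every removal.

    tol is a hypothetical absolute prediction-error threshold.
    Returns (kept_indices, removed_indices).
    """
    keep = list(range(len(xs)))
    removed = []
    while len(keep) >= 3:
        slope, intercept = fit_line([xs[i] for i in keep], [ys[i] for i in keep])
        errors = {i: abs(ys[i] - (slope * xs[i] + intercept)) for i in keep}
        worst = max(errors, key=errors.get)
        if errors[worst] <= tol:
            break  # every remaining sample passes the re-check
        keep.remove(worst)
        removed.append(worst)
    return keep, removed
```

Because the fit is recomputed after each removal, a sample whose error was inflated only because an extreme sample distorted the fit gets re-evaluated against the cleaner model instead of being deleted in the same pass.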
Fluorite powder with a high moisture content is prone to liquefaction during sea transport, which can cause the carrying ship to sink. As more such accidents have occurred, researchers have begun to investigate the causes of fluorite powder liquefaction. Because few studies of fluorite powder liquefaction exist in either research or industry, this thesis takes the initiative to study the topic. First, it considers all factors that might affect the liquefaction of fluorite powder. It then proposes a feature selection method for identifying the main factors and establishes an intelligent model of fluorite powder liquefaction. The main contributions of this thesis are as follows:
     (1) To find the main factors that affect fluorite powder liquefaction, the thesis first considers all factors that might affect it. Experiments were conducted under the relevant experimental standards, yielding valid data for 196 samples, each described by 20 properties.
     (2) A progressive abnormal-sample deletion method based on regression forecasting error is presented. Unlike methods that delete every flagged sample in a single pass, the proposed algorithm gives each sample identified as abnormal a second check, which largely avoids deleting normal samples by mistake while still improving the accuracy of the regression forecasting model.
     (3) A feature selection method that combines regression forecasting error with a genetic algorithm is presented. First, all of the properties are encoded as one genetic individual. Second, the fitness of each individual is evaluated with a fitness function based on the regression forecasting error and the number of properties the individual contains. Finally, repeated selection, crossover and mutation converge on an optimal individual, whose properties form the optimal property set, that is, the main properties affecting fluorite powder liquefaction. Within the genetic algorithm, the selection and mutation operators are improved by introducing ideas from simulated annealing, which strengthens the algorithm's global search capability.
     (4) Applying the proposed feature selection method to the experimental data yields an optimal attribute set containing eight properties. A sensitivity analysis of these eight properties on the regression forecasting model then indicates how each of them affects fluorite powder liquefaction.
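The genetic search described in item (3) can be sketched as follows. This is not the thesis's implementation: the abstract gives neither the regression model nor the exact fitness formula or operators. The sketch assumes a leave-one-out 1-nearest-neighbour regressor as the error estimate, a fitness of error plus `penalty` times the number of selected attributes (lower is better), and a Metropolis acceptance rule with a cooling temperature as one way to bring simulated annealing into selection and mutation. All function and parameter names are hypothetical.

```python
import math
import random

def loo_error(X, y, mask):
    """Leave-one-out mean absolute error of a 1-nearest-neighbour regressor
    that only sees the columns where mask is 1 (stand-in regression model)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return math.inf  # an empty attribute set cannot predict anything
    total = 0.0
    for i, row in enumerate(X):
        nearest = min((k for k in range(len(X)) if k != i),
                      key=lambda k: sum((row[j] - X[k][j]) ** 2 for j in cols))
        total += abs(y[i] - y[nearest])
    return total / len(X)

def fitness(X, y, mask, penalty):
    """Assumed fitness: prediction error plus a cost per selected attribute."""
    return loo_error(X, y, mask) + penalty * sum(mask)

def accept(delta, temp, rng):
    """Metropolis rule: always accept an improvement, sometimes accept a loss."""
    return delta <= 0 or rng.random() < math.exp(-delta / temp)

def select_features(X, y, pop_size=12, generations=30, penalty=0.05,
                    t0=1.0, cooling=0.9, seed=0):
    """Return (best_mask, best_fitness); requires at least two attributes."""
    rng = random.Random(seed)
    n = len(X[0])
    cache = {}

    def score(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = fitness(X, y, mask, penalty)
        return cache[key]

    def pick(pop, temp):
        # Two-way tournament: the second candidate wins if it is better,
        # or occasionally when it is worse while the temperature is high.
        a, b = rng.sample(pop, 2)
        return b if accept(score(b) - score(a), temp, rng) else a

    # Seed the population with the all-attributes individual plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    best, temp = min(pop, key=score), t0
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = pick(pop, temp), pick(pop, temp)
            cut = rng.randrange(1, n)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            mutant = child[:]
            mutant[rng.randrange(n)] ^= 1      # flip one attribute bit
            if accept(score(mutant) - score(child), temp, rng):
                child = mutant
            children.append(child)
        pop = children
        best = min(pop + [best], key=score)    # keep the best individual seen
        temp *= cooling
    return best, score(best)
```

Calling `select_features(X, y)` on a list of sample rows `X` and targets `y` returns a 0/1 mask over the attributes and its fitness. A larger `penalty` pushes the search toward smaller attribute sets, which is the trade-off the combined fitness function in item (3) is meant to control.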
