Abstract
To investigate the potential correlation between nitrate in drinking water and cancer mortality, we used the K-means algorithm to group United States county-level data by four major cancer risk factors (smoking, alcohol consumption, diabetes, and obesity) together with life expectancy, and then ran computer simulations to examine the direct and indirect effects of each risk factor on cancer mortality. The results show that, once life expectancy is taken into account, nitrate in drinking water is positively correlated with total cancer mortality, which is consistent with the expectations of medical researchers. The study is of considerable significance for the collection and analysis of environmental epidemiological data in China.
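The grouping step can be illustrated with a minimal sketch. It assumes the five features are min-max scaled before clustering and that the nitrate-mortality relationship is then measured within each group using a Pearson coefficient. The abstract does not state the preprocessing, the number of clusters, or the correlation measure, so these choices, the helper names, and all county values below are hypothetical.

```python
import math
import random


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single factor dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else float("nan")


if __name__ == "__main__":
    # Hypothetical counties: smoking %, drinking %, diabetes %, obesity %,
    # life expectancy (years). Not real data.
    features = [
        [24.0, 14.0, 12.5, 36.0, 74.1],
        [23.0, 15.0, 12.0, 35.0, 74.8],
        [25.0, 13.5, 13.0, 37.5, 73.9],
        [12.0, 19.0, 7.5, 24.0, 81.2],
        [13.0, 20.0, 8.0, 25.0, 80.6],
        [11.5, 18.5, 7.0, 23.5, 81.9],
    ]
    nitrate = [1.2, 2.8, 4.1, 0.9, 2.5, 3.7]  # mg/L, hypothetical
    mortality = [181.0, 188.0, 196.0, 139.0, 144.0, 150.0]  # per 100,000, hypothetical

    labels = kmeans(min_max_scale(features), k=2)
    for g in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == g]
        r = pearson([nitrate[i] for i in idx], [mortality[i] for i in idx])
        print(f"group {g}: {len(idx)} counties, r = {r:.3f}")
```

Comparing counties only within a group is one way to hold the risk factors and life expectancy roughly constant, so that a remaining nitrate-mortality correlation is less likely to be an artifact of those factors.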