Abstract
Applying machine learning methods to analyze, predict, and classify data has great practical significance in many fields, and the application of machine learning algorithms in medicine has become a research hotspot. Diabetes is a common disease, so effectively predicting whether a person has diabetes is of great value. This paper uses machine learning algorithms to predict diabetes, with Microsoft Azure Machine Learning as the experimental platform. A data set of 15000 samples was chosen, each with 11 features; 70% of the samples were used as the training set and 30% as the test set. Five algorithms were applied: neural network, logistic regression, decision tree, Bayes, and support vector machine, with prediction accuracies of 0.854, 0.787, 0.952, 0.779, and 0.781, respectively. The results show that the decision tree performed best. After the prediction method was further improved on the basis of the decision tree, the experimental results show that accuracy increased by 0.002.
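The evaluation protocol summarized in the abstract (a 70%/30% split of 15000 samples, accuracy as the metric, and a comparison across five algorithms) can be illustrated with a minimal sketch. The function names `split_indices` and `accuracy` are hypothetical helpers introduced here for illustration; they are not part of Azure Machine Learning and do not reproduce the authors' experiment. Only the accuracy figures come from the abstract.

```python
import random

# Accuracies as reported in the abstract, keyed by algorithm.
REPORTED_ACCURACY = {
    "neural network": 0.854,
    "logistic regression": 0.787,
    "decision tree": 0.952,
    "Bayes": 0.779,
    "support vector machine": 0.781,
}


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into train and test lists.

    Hypothetical helper: the seed and shuffling strategy are assumptions,
    since the abstract only states the 70%/30% proportions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(n_samples * train_fraction)
    return indices[:cut], indices[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# 15000 samples split 70%/30%, as described in the abstract.
train_idx, test_idx = split_indices(15000)
print(len(train_idx), len(test_idx))

# Rank the reported results and apply the stated 0.002 improvement
# to the best-performing model.
best = max(REPORTED_ACCURACY, key=REPORTED_ACCURACY.get)
improved = round(REPORTED_ACCURACY[best] + 0.002, 3)
print(best, improved)
```

With these proportions the split yields 10500 training samples and 4500 test samples, and the stated improvement brings the decision tree's accuracy from 0.952 to 0.954.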