Bayesian Analysis of a Gene Screening Model for Acute Myeloid Leukemia
  • English title: Bayesian Statistical Analysis of Acute Myeloid Leukemia Gene Expression Data
  • Authors: XIAO Ying (肖颖); LIU Jin-shan (刘金山)
  • Affiliations: Department of Applied Mathematics, College of Mathematics and Informatics, South China Agricultural University; School of Financial Mathematics & Statistics, Guangdong University of Finance
  • Keywords: hierarchical mixture model; acute myeloid leukemia; gene expression; MCMC algorithm
  • Journal: Mathematics in Practice and Theory (数学的实践与认识)
  • Publication date: 2019-02-08
  • Publisher: Mathematics in Practice and Theory
  • Year: 2019
  • Issue: 03
  • Funding: National Natural Science Foundation of China (11171117); Natural Science Foundation of Guangdong Province (S2011010002371)
  • Language: Chinese
  • Pages: 159-167
  • Page count: 9
  • CN: 11-2018/O1
  • ISSN: 1000-0984
  • Classification number: R733.71
Abstract
Gene expression data carry a large amount of biological information. In genomic research, screening for differentially expressed genes, that is, genes whose expression levels change significantly, is a key step toward understanding how diseases develop and toward supporting targeted drug research. In this paper, a sequence of gene-wise mean differences is constructed from acute myeloid leukemia (AML) gene expression data, and a Bayesian hierarchical normal mixture model with the number of components fixed at three is built on it. The model parameters are given priors that reflect biological characteristics of genes, which makes the model more practical. The parameters are estimated with a Markov chain Monte Carlo (MCMC) algorithm, and the fitted model is used to screen for differentially expressed AML genes. For the real-data analysis, an AML data set was obtained from the high-throughput gene expression database of the U.S. National Center for Biotechnology Information (NCBI). Of the 14688 AML genes remaining after nonspecific filtering, 711 were identified as differentially expressed, which is only 4.84% of the total. This result is consistent with the biological principle that most genes are not differentially expressed.
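The abstract does not state the likelihood, the priors, or the sampler used, so the following is only a minimal sketch of the general technique it names: a Gibbs sampler (one kind of MCMC) for a three-component normal mixture fitted to gene-wise mean differences. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `gibbs_three_component`, the component roles (down-regulated, null with mean fixed at 0, up-regulated), the Dirichlet, normal, and inverse-gamma priors with their hyperparameters, the 0.5 posterior-probability cutoff, and the simulated data.

```python
import numpy as np

def gibbs_three_component(d, n_iter=600, burn_in=200, seed=0):
    """Gibbs sampler for d_g ~ sum_k w_k N(mu_k, var_k), k = down, null, up."""
    rng = np.random.default_rng(seed)
    G = d.size
    mu = np.array([-2.0, 0.0, 2.0])    # null mean (index 1) stays fixed at 0
    var = np.ones(3)
    w = np.full(3, 1.0 / 3.0)
    alpha = np.ones(3)                 # Dirichlet prior on weights (assumed)
    m0, s0_sq = np.array([-2.0, 0.0, 2.0]), 4.0   # normal prior on means (assumed)
    a0, b0 = 2.0, 1.0                  # inverse-gamma prior on variances (assumed)
    null_count = np.zeros(G)

    for it in range(n_iter):
        # 1. Sample each gene's component label from its conditional posterior.
        log_p = (np.log(w) - 0.5 * np.log(2 * np.pi * var)
                 - 0.5 * (d[:, None] - mu) ** 2 / var)
        log_p -= log_p.max(axis=1, keepdims=True)
        p = np.exp(log_p)
        p /= p.sum(axis=1, keepdims=True)
        u = rng.random(G)
        z = np.minimum((u[:, None] > np.cumsum(p, axis=1)).sum(axis=1), 2)
        n = np.bincount(z, minlength=3)

        # 2. Sample the mixture weights.
        w = rng.dirichlet(alpha + n)

        # 3. Sample the means (except the fixed null mean) and the variances.
        for k in range(3):
            dk = d[z == k]
            if k != 1:
                prec = 1.0 / s0_sq + n[k] / var[k]
                mean = (m0[k] / s0_sq + dk.sum() / var[k]) / prec
                mu[k] = rng.normal(mean, np.sqrt(1.0 / prec))
            shape = a0 + 0.5 * n[k]
            rate = b0 + 0.5 * ((dk - mu[k]) ** 2).sum()
            var[k] = rate / rng.gamma(shape)   # inverse-gamma draw

        if it >= burn_in:
            null_count += (z == 1)

    # Posterior probability that each gene is NOT in the null component.
    p_diff = 1.0 - null_count / (n_iter - burn_in)
    return p_diff, mu, w

# Simulated mean differences (not real AML data): 60 down, 1880 null, 60 up.
rng = np.random.default_rng(1)
truth = np.repeat([0, 1, 2], [60, 1880, 60])
d = rng.normal(np.array([-3.0, 0.0, 3.0])[truth], 0.5)

p_diff, mu, w = gibbs_three_component(d)
flagged = p_diff > 0.5
print(f"flagged {flagged.sum()} of {d.size} genes ({100 * flagged.mean():.2f}%)")
```

The sketch relies on its starting values to keep the three components in their intended roles; a more careful implementation would constrain the signs of the outer means or otherwise guard against label switching.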
