Missing Call Bias in Human Genome Analysis and Estimation of Mutation Rates of Copy Number Variants
Abstract
Technological revolutions have carried genetics into the '-omics' era, and microarray technology has produced a large volume of genetic data. To mine these data in depth, methods from other fields, such as statistics and informatics, are being combined ever more closely with genetics. This thesis summarizes the work of my five years of doctoral study in two parts, each illustrating how statistics can be applied to problems in genetics.
     I. We investigated missing call bias (MCB) in high-throughput single nucleotide polymorphism (SNP) genotyping platforms and its effects on downstream analyses. The advent of high-throughput, cost-effective genotyping platforms made genome-wide association (GWA) studies a reality. However, researchers have focused mainly on reducing genotyping error and have largely overlooked another quality problem: missing calls. To examine the effect of missing calls on GWA studies, we genotyped the missing calls from four mainstream platforms (TaqMan SNP genotyping, GenomeLab SNPstream, Illumina BeadLab and the Affymetrix Human Mapping 500K array) by resequencing, and confirmed experimentally that MCB is prevalent across platforms. We then showed theoretically that MCB biases downstream analyses, including estimation of allele and genotype frequencies, tests of Hardy-Weinberg equilibrium (HWE), and the statistical power of association tests under different disease models. MCB usually reduces the power of association tests, and the loss is typically greater than that caused by an equivalent unbiased reduction in sample size. We also compared the effects of MCB with those of genotyping errors on frequency estimation and association tests. In most cases, the bias caused by genotyping quality problems can be greatly reduced by increasing the call rate, even at the cost of some genotyping accuracy. This finding suggests that the common practice of assigning 'no-call' to observations of borderline quality should be revised. If the objective is to minimize bias, the cut-offs for call rate and genotyping error rate in GWA studies should be chosen jointly. We suggest revising current quality control standards by raising the call-rate threshold while relaxing the requirement on genotyping accuracy.
     II. We proposed a new statistical method for approximately estimating the mutation rates of copy number variants (CNVs). CNVs in the human genome contribute to Mendelian diseases, complex diseases and genomic plasticity in evolution. Investigating the mechanisms of CNV formation and estimating CNV mutation rates are critical to understanding the etiology of CNV-associated traits. Much progress has been made in unraveling the mechanisms of CNV formation; however, estimating mutation rates experimentally at the genome level remains impractical, because it requires very large samples and accurate typing. In this study, we showed that CNV mutation rates can be estimated approximately from population genotype data. The estimates are sufficient for comparing mutation rates between CNVs across the genome and thereby identifying mutational hotspots. Analyzing 4,330 CNV loci from three HapMap populations, we found that the mutation rates of most CNVs are on the order of 10^-5 per generation, consistent with the sporadic estimates from molecular assays. Notably, 132 CNVs (3.0%) have mutation rates on the order of 10^-3 per generation and were identified as hotspots. Further analysis suggested that differences in genome architecture and rearrangement mechanisms likely gave rise to CNV hotspots in the human genome.
     In the near future, massive amounts of data from next-generation sequencing will continue to emerge, and many unresolved questions in genetics are likely to be answered. Mining these data will depend on statistics and informatics. Let us prepare for the arrival of this golden age of the life sciences.
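The effect of missing call bias on frequency estimation described in the first part can be illustrated with a small worked example. The sketch below is not the model used in the thesis; it only assumes a biallelic SNP in Hardy-Weinberg equilibrium and genotype-specific call rates. The function names and the call-rate values are hypothetical and chosen purely for illustration.

```python
def hwe_genotype_freqs(p):
    """Expected (AA, AB, BB) frequencies for allele-A frequency p under HWE."""
    q = 1.0 - p
    return (p * p, 2.0 * p * q, q * q)


def observed_genotype_freqs(true_freqs, call_rates):
    """Genotype frequencies among called samples, given per-genotype call rates."""
    called = [f * c for f, c in zip(true_freqs, call_rates)]
    total = sum(called)
    return tuple(x / total for x in called)


def allele_freq(genotype_freqs):
    """Frequency of allele A computed from (AA, AB, BB) genotype frequencies."""
    aa, ab, _bb = genotype_freqs
    return aa + 0.5 * ab


def het_excess(genotype_freqs):
    """Observed heterozygosity minus the HWE expectation at the observed allele frequency."""
    p = allele_freq(genotype_freqs)
    return genotype_freqs[1] - 2.0 * p * (1.0 - p)


# Hypothetical scenario: true allele frequency 0.3.
true_freqs = hwe_genotype_freqs(0.3)

# Unbiased missingness: every genotype is called at the same (assumed) rate.
random_missing = observed_genotype_freqs(true_freqs, (0.9, 0.9, 0.9))

# Missing call bias: heterozygotes are called less often (assumed rates).
biased_missing = observed_genotype_freqs(true_freqs, (1.0, 0.8, 1.0))

for label, freqs in [("true", true_freqs),
                     ("random missing", random_missing),
                     ("biased missing", biased_missing)]:
    print(f"{label}: allele freq = {allele_freq(freqs):.4f}, "
          f"het excess = {het_excess(freqs):+.4f}")
```

With equal call rates the estimates are unchanged and only the effective sample size shrinks. In this example, when heterozygotes are under-called, the estimated allele frequency shifts away from the true value and an apparent heterozygote deficit appears, which is the kind of distortion a Hardy-Weinberg test would pick up.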
    21. Itsara A, Cooper GM, Baker C, Girirajan S, Li J, Absher D, Krauss RM, Myers RM, Ridker PM, Chasman DI et al:Population analysis of large copy number variants and hotspots of human genetic disease. Am J Hum Genet 2009,84(2):148-161.
    22. Hollox EJ, Huffmeier U, Zeeuwen PL, Palla R, Lascorz J, Rodijk-Olthuis D, van de Kerkhof PC, Traupe H, de Jongh G, den Heijer M et al:Psoriasis is associated with increased beta-defensin genomic copy number. Nat Genet 2008,40(1):23-25.
    23. Yang Y, Chung EK, Wu YL, Savelli SL, Nagaraja HN, Zhou B, Hebert M, Jones KN, Shu Y, Kitzmiller K et al:Gene copy-number variation and associated polymorphisms of complement component C4 in human systemic lupus erythematosus (SLE):low copy number is a risk factor for and high copy number is a protective factor against SLE susceptibility in European Americans. Am J Hum Genet 2007,80(6):1037-1054.
    24. Huang RS, Chen P, Wisel S, Duan S, Zhang W, Cook EH, Das S, Cox NJ, Dolan ME:Population-specific GSTM1 copy number variation. Hum Mol Genet 2009,18(2):366-372.
    25. Sharp AJ, Mefford HC, Li K, Baker C, Skinner C, Stevenson RE, Schroer RJ, Novara F, De Gregori M, Ciccone R et al:A recurrent 15q13.3 microdeletion syndrome associated with mental retardation and seizures. Nat Genet 2008,40(3):322-328.
    26. Shinawi M, Schaaf CP, Bhatt SS, Xia Z, Patel A, Cheung SW, Lanpher B, Nagl S, Herding HS, Nevinny-Stickel C et al:A small recurrent deletion within 15q13.3 is associated with a range of neurodevelopmental phenotypes. Nat Genet 2009,41(12):1269-1271.
    27. Brunetti-Pierri N, Berg JS, Scaglia F, Belmont J, Bacino CA, Sahoo T, Lalani SR, Graham B, Lee B, Shinawi M et al:Recurrent reciprocal Iq21.1 deletions and duplications associated with microcephaly or macrocephaly and developmental and behavioral abnormalities. Nat Genet 2008, 40(12):1466-1471.
    28. Turner DJ, Miretti M, Rajan D, Fiegler H, Carter NP, Blayney ML, Beck S, Hurles ME:Germline rates of de novo meiotic deletions and duplications causing several genomic disorders. Nat Genet 2008,40(1):90-95.
    29. Lupski JR:Genomic rearrangements and sporadic disease. Nat Genet 2007, 39(7 Suppl):S43-47.
    30. van Ommen GJ:Frequency of new copy number variation in humans. Nat Genet 2005,37(4):333-334.
