Construction of Representation Systems for Biological Sequences and Study of Their Structure-Function Relationships
Abstract
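Structural representation of biological sequences (peptides, proteins and nucleic acids) is both an important part of, and a key prerequisite for, research on their structure-function relationships. Whether the sequence descriptors reasonably capture the structural information closely tied to function determines whether such research succeeds. Because the structural information that determines the function of a biological sequence is encoded in its primary sequence, analyzing primary sequence features is essential to structure-function studies. This dissertation examines the primary sequence features of various biological sequences and constructs two structural representation systems: ① 516 multidimensional property parameters of the 20 natural amino acids were collected and subjected to factor analysis, giving the factor analysis scales of generalized amino acid information (FASGAI); ② 1209 multidimensional property parameters of 5 bases were collected and subjected to principal component analysis, giving the scores of generalized base properties (SGBP). The results show that both systems have clear physicochemical meaning, strong representation ability, good extensibility and simple operation.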
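     FASGAI was applied to quantitative structure-activity relationship (QSAR) studies of bitter-tasting dipeptides, angiotensin-converting enzyme inhibitors and cationic antimicrobial peptides; to cleavage-site prediction and specificity analysis for human immunodeficiency virus protease (HIV PR); and to QSAR studies of HLA-A*0201-restricted T-cell epitopes and of peptides binding the SH3 domain of human amphiphysin-1. Good results were obtained in all cases. For bitter-tasting dipeptides, activity may be strongly positively correlated with the bulky properties of the 1st residue and with the bulky properties and hydrophobicity of the 2nd residue, and strongly negatively correlated with the α-helix and turn propensities of the 2nd residue. For angiotensin-converting enzyme inhibitors, larger values of the bulky properties and hydrophobicity of the 2nd residue and of the electronic properties of the 1st residue may favor higher activity, whereas larger values of the compositional characteristics of the 2nd residue may lower it. For cationic antimicrobial peptides, the electronic properties of the 10th residue, the bulky properties of the 7th residue, the hydrophobicity of the 12th residue and the electronic properties of the 3rd residue may contribute strongly and positively to antibacterial activity, whereas the hydrophobicity and compositional characteristics of the 6th residue and the hydrophobicity of the 10th residue may contribute strongly and negatively. The cleavage-site prediction and specificity analysis suggests that HIV PR may recognize key features at specific sites of the octapeptide sequence: the bulky properties, secondary structure information, electronic properties and hydrophobicity of the 1st, 2nd, 4th, 5th and 6th residues may be important factors in deciding whether HIV PR cleaves, and bulky properties in particular may be a key feature recognized by HIV PR. For HLA-A*0201-restricted T-cell epitopes, the bulky properties and hydrophobicity of the 3rd residue, the bulky properties of the 2nd residue and the hydrophobicity of the 9th residue may make the largest positive contributions to affinity, whereas the hydrophobicity of the 4th residue and the local flexibility of the 3rd residue may make the largest negative contributions. For the 10-residue peptides (P-5 P-4 P-3 P-2 P-1 P0 P1 P2 P3 P4) that bind the SH3 domain of human amphiphysin-1, analysis of the key interactions suggests that the properties of the residues from P-3 to P2 (inclusive) may influence affinity most markedly; in particular, the electronic properties and the hydrophobicity of the P-3 residue may make the largest positive and the largest negative contribution to affinity, respectively.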
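     New methods for predicting protein structure and function that do not depend on sequence homology or structural similarity were developed. FASGAI was applied to the classification or identification of basic helix-loop-helix (bHLH) proteins, protein β-turn structures, the G-protein-coupled receptor family and the hemagglutinin proteins of highly pathogenic avian influenza virus. For bHLH proteins, most of the variables with a marked influence on classification came from sites 5, 8, 9 and 13 of the functional motif (residues 1 to 13), and a few came from sites 4, 6, 10 and 12, which suggests that the variables at these sites may be important recognition features of the DNA-binding region. Analysis of variance showed that, at sites 5, 8, 9 and 13, all properties except the local flexibility of the 8th residue and the bulky properties of the 9th residue differed significantly to varying degrees, and these differences allow bHLH proteins to be classified well. The β-turn prediction results indicate that FASGAI represents the features of β-turn residues well and provides some important information about β-turns. FASGAI representation followed by auto cross covariance (ACC) transformation and support vector machine (SVM) modeling was used to identify the G-protein-coupled receptor family and the hemagglutinin proteins of highly pathogenic avian influenza virus. The results show that FASGAI is an excellent method for representing protein sequence structure, and that the FASGAI-ACC-SVM approach offers a new line of research for identifying these two protein groups.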
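     SGBP was applied to predicting the promoter strength of Escherichia coli promoters and to predicting human gene promoters, with good results in both cases. For E. coli promoters (-49 bp to +19 bp), the properties of the bases at positions -45, -38, -28, -27, -22, -21, -5, +4, +8, +14 and +15 may have a marked influence on promoter strength, which makes promoter strength prediction and promoter sequence design possible. Human gene promoters (-250 bp to +50 bp) were predicted with SGBP representation, ACC transformation and SVM modeling, and the results were, to varying degrees, comparable or superior to those of the other prediction methods compared. The SGBP-ACC-SVM procedure could be tried further on the identification of other promoters and on the prediction of mRNA transcription properties and RNA secondary structure.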
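     Various QSAR modeling and pattern recognition methods were compared, in particular partial least squares (PLS), linear discriminant analysis (LDA) and SVM, as applied to structure-function studies of biological sequences; the comparison covers variable selection, parameter selection and model validation. The results show that PLS handles well the case of many, multicollinear variables. LDA gives robust pattern recognition results and easily interpreted models. SVM copes well with small samples, nonlinearity, high dimensionality and local minima, which gives it broad prospects in structure-function studies of biological sequences, although issues such as parameter setting need further study. Response surface analysis was used, on an exploratory basis, to set the SVM parameters, and proved fairly effective for this purpose. Stepwise multiple regression, a genetic algorithm and a stepwise approach were used selectively for variable selection, and all three removed noise from the original variables well. Model quality was assessed by both internal and external validation. The internal validation methods were self-consistency testing, leave-one-out and leave-group-out validation; on that basis, prediction set samples were used to evaluate the external predictive ability of the models and so ensure their validity.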
Representation of biological sequences (peptides, proteins and nucleic acids) is crucial to the investigation of their structure-activity relationships. The structural descriptors should reflect the structural information closely related to activity, and whether they do determines the success of a structure-activity study. The structures related to the activities of biological sequences are determined by the information contained in their primary sequences, so investigating the characteristics of primary sequences is of great significance. Two representation techniques were constructed in this dissertation, taking into account the diverse properties and activities of biological sequences: ① factor analysis scales of generalized amino acid information (FASGAI), derived from 516 property parameters of the 20 coded amino acids; ② scores of generalized base properties (SGBP), derived from principal component analysis of a matrix of 1209 property parameters. The results demonstrate that both FASGAI and SGBP vectors have clear physicochemical meaning, high representation ability, good extensibility and easy manipulation.
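To illustrate how a score-type scale can be distilled from a matrix of property parameters, the following minimal sketch extracts the first principal component by power iteration. It is only an illustration under stated assumptions: the input is a toy matrix rather than real amino acid or base properties, the function names `standardize` and `first_principal_component` are hypothetical, and FASGAI itself was derived by factor analysis, not by this simplified PCA.

```python
import math

def standardize(matrix):
    """Scale each column to zero mean and unit (population) variance."""
    n, p = len(matrix), len(matrix[0])
    columns = []
    for j in range(p):
        col = [row[j] for row in matrix]
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        if std == 0:
            raise ValueError("constant column cannot be standardized")
        columns.append([(x - mean) / std for x in col])
    # Transpose back to rows = items, columns = properties.
    return [[columns[j][i] for j in range(p)] for i in range(n)]

def first_principal_component(matrix, iterations=200):
    """Return (loadings, scores) of the first principal component.

    Power iteration on X^T X, where X is the standardized matrix.
    """
    x = standardize(matrix)
    n, p = len(x), len(x[0])
    loadings = [1.0 / math.sqrt(p)] * p  # arbitrary unit-length start
    for _ in range(iterations):
        scores = [sum(row[j] * loadings[j] for j in range(p)) for row in x]
        w = [sum(x[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        loadings = [c / norm for c in w]
    scores = [sum(row[j] * loadings[j] for j in range(p)) for row in x]
    return loadings, scores

# Toy data: 4 items x 3 properties (made-up numbers, not real parameters).
toy = [
    [1.0, 2.0, 0.5],
    [2.0, 4.1, 0.1],
    [3.0, 5.9, 0.9],
    [4.0, 8.2, 0.4],
]
loadings, scores = first_principal_component(toy)
```

Each item (row) receives one score per retained component, so a large property table collapses into a short descriptor vector per amino acid or base.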
     FASGAI vectors were applied to represent the structures of several functional peptides, including bitter-tasting dipeptides, angiotensin-converting enzyme inhibitors, cationic antimicrobial peptides, octapeptides cleaved by HIV-1 protease (HIV PR), HLA-A*0201-restricted T-cell epitopes and decapeptides binding to the SH3 domain of human amphiphysin-1. Favorable quantitative structure-activity relationship (QSAR) models were then developed using various modeling techniques. The results showed that the activities of bitter-tasting dipeptides may be strongly positively correlated with the bulky properties of the 1st residue and with the bulky properties and hydrophobicity of the 2nd residue, and strongly negatively correlated with the α-helix and turn propensities of the 2nd residue. For angiotensin-converting enzyme inhibitors, an increase in the bulky properties and hydrophobicity of the 2nd residue and in the electronic properties of the 1st residue may enhance activity, whereas an increase in the compositional characteristics of the 2nd residue may reduce it. For antimicrobial peptides, the electronic properties of the 10th residue, the bulky properties of the 7th residue, the hydrophobicity of the 12th residue and the electronic properties of the 3rd residue may have a strong positive effect on activity, whereas the hydrophobicity and compositional characteristics of the 6th residue and the hydrophobicity of the 10th residue may have a strong negative effect. HIV PR may recognize diverse key properties at various sites of the octameric sequences. These properties, including the bulky properties, secondary conformation characteristics, electronic properties and hydrophobicity of the 1st, 2nd, 4th, 5th and 6th residues, may be important factors in determining whether HIV PR cleaves, and the bulky properties of the corresponding sites in particular may be key features recognized by HIV PR. For HLA-A*0201-restricted T-cell epitopes, the bulky properties and hydrophobicity of the 3rd residue, the bulky properties of the 2nd residue and the hydrophobicity of the 9th residue may contribute most positively to affinity, whereas the hydrophobicity of the 4th residue and the local flexibility of the 3rd residue may contribute most negatively. For the decapeptide (P-5 P-4 P-3 P-2 P-1 P0 P1 P2 P3 P4), the properties of the residues from the P-3 site to the P2 site (inclusive) may have a marked effect on the interaction with the human amphiphysin-1 SH3 domain. In particular, the electronic properties of the P-3 residue may make a large positive contribution to the interaction, and the hydrophobicity of the P-3 residue a large negative one.
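Statements such as "bulky properties of the 1st residue" presuppose that every peptide of fixed length is turned into one variable per position and per scale. A minimal sketch of that encoding step follows. The table `TOY_SCALES`, the names in `SCALE_NAMES` and the function `encode_peptide` are hypothetical, and the numeric values are placeholders, not the published FASGAI scores.

```python
# Placeholder values only: these are NOT the published FASGAI scores.
TOY_SCALES = {
    "A": (0.1, -0.2),
    "G": (-0.3, 0.4),
    "L": (0.8, 0.5),
}
SCALE_NAMES = ("scale1", "scale2")  # hypothetical scale labels

def encode_peptide(sequence, scales=TOY_SCALES, scale_names=SCALE_NAMES):
    """Concatenate per-residue scale values into one position-wise vector.

    Returns (names, values), where names[i] labels values[i], for example
    'pos2_scale1' for the first scale of the second residue.
    """
    names, values = [], []
    for position, residue in enumerate(sequence, start=1):
        if residue not in scales:
            raise KeyError(f"no scale values for residue {residue!r}")
        for name, value in zip(scale_names, scales[residue]):
            names.append(f"pos{position}_{name}")
            values.append(value)
    return names, values

names, values = encode_peptide("LG")
```

With such named variables, the coefficient that a regression model assigns to, say, `pos2_scale1` can be read as the contribution of that property at that position.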
     New prediction techniques that do not depend on sequence homology or structural similarity were developed to predict protein structure and function. FASGAI vectors were used to classify or identify basic helix-loop-helix (bHLH) proteins, β-turns of proteins, G-protein-coupled receptors (GPCRs) and hemagglutinins of highly pathogenic avian influenza virus (HPAIV). For bHLH proteins, most of the variables with a marked influence on classification came from the 5th, 8th, 9th and 13th sites of the motif formed by the first 13 residues, and a few came from the 4th, 6th, 10th and 12th sites, which suggests that these properties may be key recognition features of the DNA-binding region. Analysis of variance indicated significant differences in the property parameters of the 5th, 8th, 9th and 13th sites, except for the local flexibility of the 8th residue and the bulky properties of the 9th residue, so these properties may be used to classify bHLH proteins. The β-turn prediction results showed that FASGAI vectors represent the characteristics of β-turn residues well and also yield some important information about them. The FASGAI-ACC-SVM methodology, which combines FASGAI representation, auto cross covariance (ACC) transformation and support vector machine (SVM) modeling, was used to identify GPCRs and hemagglutinins of HPAIV. The results demonstrated that FASGAI vectors are an excellent representation technique for protein sequences, and that the FASGAI-ACC-SVM methodology offers a new approach to identifying GPCRs and hemagglutinins of HPAIV.
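The ACC step matters because proteins differ in length while an SVM needs fixed-length input. The sketch below shows one common form of the auto cross covariance transform, assumed here for illustration since the abstract does not state the exact formula or normalization used: each scale is centered over the sequence, and for every lag and every ordered pair of scales the lagged products are averaged. The function name `acc_transform` and the toy descriptor values are hypothetical.

```python
def acc_transform(residue_descriptors, max_lag):
    """Auto cross covariance features for one sequence.

    residue_descriptors: list of per-residue descriptor vectors
    (n residues x d scales). Returns max_lag * d * d features, so the
    output length does not depend on the sequence length n.
    """
    n = len(residue_descriptors)
    d = len(residue_descriptors[0])
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    # Center every scale over the sequence.
    means = [sum(row[j] for row in residue_descriptors) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)]
                for row in residue_descriptors]
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(d):          # scale at position i
            for k in range(d):      # scale at position i + lag
                total = sum(centered[i][j] * centered[i + lag][k]
                            for i in range(n - lag))
                features.append(total / (n - lag))
    return features

# Toy sequences of different lengths, 2 made-up scales per residue.
short_seq = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]]
long_seq = short_seq + [[0.5, 0.2], [1.5, 0.7]]
short_features = acc_transform(short_seq, max_lag=2)
long_features = acc_transform(long_seq, max_lag=2)
```

Both toy sequences yield vectors of the same length (2 lags x 2 x 2 scale pairs), which is what allows sequences of unequal length to be passed to a single classifier.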
     SGBP vectors were employed to predict the promoter strengths of E. coli promoters and to identify human genome promoters. The properties of the bases at positions -45, -38, -28, -27, -22, -21, -5, +4, +8, +14 and +15 may have a marked influence on the strength of E. coli promoters of 68 base pairs (-49 bp to +19 bp), which opens a route to predicting promoter strength and designing promoter sequences. The results for the prediction of human genome promoters (-250 bp to +50 bp) suggest that the SGBP-ACC-SVM methodology, which combines SGBP representation, ACC transformation and SVM modeling, has wide prospects for the prediction of other promoters, of mRNA transcription properties and of RNA secondary structure.
     Modeling and pattern recognition methods, particularly partial least squares (PLS), linear discriminant analysis (LDA) and SVM, were investigated. Techniques for variable selection, parameter determination and model validation are also discussed in this dissertation. The results showed that PLS avoids the harmful effects of multicollinearity in modeling and is particularly suited to regression when the number of observations is smaller than the number of variables. Models developed by LDA are robust and easy to interpret. As a newer machine learning algorithm, SVM deals well with small datasets, nonlinear problems, high-dimensional feature spaces and local minima, which gives it wide prospects in structure-activity studies of biological sequences. However, many issues, such as the selection of kernel functions and the corresponding parameters, remain to be studied in detail. The SVM parameters were tentatively determined by response surface methodology in order to obtain reliable results, and the methodology proved effective for this purpose. In addition, stepwise multiple regression, a genetic algorithm and a stepwise procedure were used to optimize variable subsets, and all three methods efficiently removed noise from the original variables. Self-consistency, leave-one-out and leave-group-out tests were used for internal validation. On the basis of the internal validation, external validation was performed on the prediction data set in order to ensure the validity of the models obtained.
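As a small illustration of leave-one-out internal validation, the sketch below computes a cross-validated q² for a one-variable least-squares model. It stands in for the PLS and SVM models of the dissertation, which are not reproduced here; the function names `fit_line` and `loo_q2`, the toy data, and the convention of computing the total sum of squares around the mean of all responses are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_q2(xs, ys):
    """Leave-one-out q2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without sample i, then predict sample i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total

# Toy data: an exactly linear response and a noisy one.
x = [0.0, 1.0, 2.0, 3.0]
q2_exact = loo_q2(x, [1.0, 3.0, 5.0, 7.0])
q2_noisy = loo_q2(x, [1.0, 3.5, 4.5, 7.0])
```

A q² close to 1 means the held-out samples are predicted well; external validation applies the same kind of comparison to a prediction set that played no part in fitting or variable selection.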
引文
[1] 陈凯先, 蒋华良, 嵇汝运. 计算机辅助药物设计——原理、方法及应用. 上海: 上海科学技术出版社, 2000
    [2] 徐筱杰, 侯廷军, 乔学斌等. 计算机辅助药物分子设计. 北京: 化学工业出版社, 2004
    [3] 郭宗儒. 药物分子设计. 北京: 科学出版社, 2005
    [4] 李仁利. 药物构效关系. 北京: 中国医药科技出版社, 2004
    [5] 梁桂兆, 梅虎, 周原等. 计算机辅助药物设计中的多维定量构效关系模型化方法. 化学进展, 2006, 18(1): 120-127
    [6] Urbina JA, Payares G, Molina J et al. Cure of short and long-term experimental Chaga’s disease using D0870. Science, 1996, 2731: 5277-5280
    [7] Kubinyi H. From narcosis to hyperspace: The history of QSAR. Quant. Struct.-Act. Relat., 2002, 21: 348-356
    [8] Hansch C, Fujita T. Correlation of biological activity of phenoxyacetic acids with hammett substituent constants and partition coefficient. Nature, 1962, 194: 178-179
    [9] Hansch C, Fujita T. p-σ-π analysis. A method for the correlation of biological activity and chemical structure. J. Am. Chem. Soc., 1964, 86(8): 1616-1626
    [10] Free SMJr, Wilson JW. A mathematical contribution to structure activity studies. J. Med. Chem., 1964, 7: 395-399
    [11] Hansch C, Muir M, Fujita T et al. The correlation of biological activity of plant growth regulators and chloromycetin derivatives with Hammett constants and partition coefficients. J. Am. Chem. Soc., 1963, 85: 2817-2824
    [12] Fujita T, Ban T. Structure-activity study of phenethylamines as substrates of biosynthetic enzymes of sympathetic transmitters. J. Med. Chem., 1971, 14: 148-152
    [13] Unger SH, Hansch C. On model building in structure-activity relations. A reexamination of adrenergic blocking activity of beta-halo-beta-arylalkylamines. J. Med. Chem., 1973, 16: 745-749
    [14] Randic M. On characterization of molecular branching. J. Am. Chem. Soc., 1975, 97(23): 6609-6615
    [15] Kier LB, Murray WJ, Hall LH. Molecular connectivity 4: Relationship to biological activity. J. Med. Chem., 1975, 18(12): 1272-1274
    [16] Kier LB, Hall LH. An electrotopological state for atoms in molecules. J. Pharm. Res., 1990, 7: 801-807
    [17] Liu SS, Cai SX, Cao CZ et al. Molecular electronegative distance vector (MEDV) relating to 15 properties of alkanes. J. Chem. Inf. Comput. Sci., 2001, 40(6): 1337-1348
    [18] Liu SS, Yin CS, Cai SX et al. QSAR study of steroid benchmark and dipeptides based on MEDV-13. J. Chem. Inf. Comput. Sci., 2001, 41(2): 321-329
    [19] Cramer RD, Patterson DE, Bunce JD. Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. J. Am. Chem. Soc., 1988, 110: 5959-5967
    [20] Xu Y, Liu H, Niu CY et al. Molecular docking and 3D QSAR studies on 1-amino-2- phenyl-4-(piperidin-1-yl)-butanes based on the structural modeling of human CCR5 receptor. Bioorg. Med. Chem., 2004, 12: 6193-6208
    [21] Doweyko AM. The hypothetical active site lattic. An approach to modeling active sites from data on inhibitor molecules. J. Med. Chem., 1988, 31: 1396-1406
    [22] Todeschini R, Lasagni M, Marengo E. New molecular descriptors for 2D and 3D structures. Theory. J. Chemom., 1994, 8: 263-272
    [23] Ginn CMR, Turner DB, Willett P et al. Similarity searching in files of three-dimensional chemical structures: Evaluation of the EVA descriptor and combination of rankings using data fusion. J. Chem. Inf. Comput. Sci., 1997, 37: 23-27
    [24] Menezes IRA, Lopes JCD, Montanari CA et al. 3D QSAR studies on binding affinities of coumarin natural products for glycosomal GAPDH of Trypanosoma cruzi. J. Comput. Aid. Mol. Des., 2003, 17: 277-290
    [25] Hasegawa K, Matsuoka S, Arakawa M et al. New molecular surface-based 3D-QSAR method using Kohonen neural network and 3-way PLS. J. Comput. Chem., 2002, 26:583-589
    [26] Polanski J, Gieleciak R, Bak A. The comparative molecular surface analysis (COMSA)-A nongrid 3D QSAR method by a coupled neural network and PLS system: Predicting pKa values of benzoic and alkanoic acids. J. Chem. Inf. Comput. Sci., 2002, 42: 184-191
    [27] Kubinyi H. QSAR and 3D QSAR in drug design. Part 1: methodology. Drug Discov.Today, 1997, 2: 457-467
    [28] Hopfinger AJ, Wang S, Tokarski JS et al. Construction of 3D-QSAR models using the 4D-QSAR analysis formalism. J. Am. Chem. Soc., 1997, 119: 10509-10524
    [29] Albuquerque MG, Hopfinger AJ, Barreiro EJ et al. Four-dimensional quantitative structure-activity relationship analysis of a series of interphenylene 7-oxabicycloheptane oxazole thromboxane A2 receptor antagonists. J. Chem. Inf. Comput. Sci., 1998, 38: 925-938
    [30] Vedani A, Briem H, Dobler M et al. Multiple conformation and protonation state representation in 4D-QSAR: The neurokinin-1 receptor system. J. Med. Chem., 2000, 43: 4416-4427
    [31] Vedani A, Dober M. 5D QSAR: The key for simulating induced fit?. J. Med. Chem., 2002,45(11): 2139-2149
    [32] Vedani A, Dobler M. Multi-dimentinal QSAR in drug research. Predicting binding affinities, toxicity and pharmacokinetic parameters. Prog. Drug Res., 2000, 55: 105-135
    [33] Vedani A, Dobler M, Lill MA. Combining protein modeling and 6D-QSAR. Simulating the binding of structurally diverse ligands to the estrogen receptor. J. Med. Chem., 2005, 48: 3700-3703
    [34] 王连生, 韩朔睽. 分子结构、性质与活性. 北京: 化学工业出版社, 1997
    [35] Karelson M, Lobanov VS, Katritzky AR. Quantum-chemical descriptors in QSAR/QSPR studies. Chem. Rev., 1996, 96: 1027-1043
    [36] Livingstone DJ. The characterization of chemical structures using molecular properties. A survey. J. Chem. Inf. Comput. Sci., 2000, 40: 195-209
    [37] Sneath PH. Relations between chemical structure and biological activity in peptides. J. Theor. Biol., 1966, 12(2): 157-195
    [38] Kidera A, Konishi Y, Oka M et al. A statistical analysis of the physical properties of the 20 naturally occuring amino acids. J. Protein Chem., 1985, 4: 23-55
    [39] Hellberg S, Sj?str?m M, Skagerberg B et al. Peptide quantitative structure-activity relationships, a multivariate approach. J. Med. Chem., 1987, 30: 1126-1135
    [40] Hellberg S, Eriksson L, Jonsson J et al. Minimum analogue peptide sets (MAPS) for quantitative structure-activity relationships. Int. J. Pept. Protein Res., 1991, 37: 414-424
    [41] Sandberg M, Eriksson L, Jonsson J et al. New chemical descriptors for the design of biologically active peptides. a multivariate charaterrization of 87 amino acids. J. Med.Chem., 1998, 41: 2481-2491
    [42] Cocchi M, Johansson E. Amino acids characterization by GRID and multivariate data analysis. Quant. Struct.-Act. Relat., 1993, 12(4): 1-8
    [43] Kim J, Nam KY, Cho KH et al. Theoretical study on hydrophobicity of amino acids by the solvation free energy density model. Bull Korean Chem Soc., 2003, 24(12): 1742-1750
    [44] Collantes ER, Dunn WJ. Amino acid side chain descriptors for quantitative structure activity relationship studies of peptide analogues. J. Med. Chem., 1995, 38: 2705-2713
    [45] Zaliani A, Gancia E. MS-WHIM scores for amino acids: A new 3D-description for peptide QSAR and QSPR studies. J. Chem. Inf. Compt. Sci., 1999, 39: 525-533
    [46] Raychaudhury C, Banerjee A, Bag P et al. Topological shape and size of peptides: Identification of potential allele specific helper T cell antigenic sites. J. Chem. Inf. Comput. Sci., 1999, 39: 248-254
    [47] Andersson PM, Sj?strom M, Lundstedt T. Preprocessing peptide sequences for multivariatesequence-property analysis. Chemometr. Intell. Lab. Syst., 1998, 42: 41-50
    [48] Patel S, Stott IP, Bhakko M et al. Patenting computer-designed peptides. J. Comput. Aid. Mol. Des., 1998, 12: 543-556
    [49] Sotomatsu-Niwa T, Ogino A. Evaluation of the hydrophobic of the amino acids side chains of peptides and their application in QSAR and conformational studies. J. Mol. Struct. (Theochem)., 1997, 392: 43-54
    [50] Liu SS, Yin CS, Cai SX et al. A novel MHDV descriptor for dipeptide QSAR studies. J. Chin. Chem. Soc., 2001, 48: 253-260
    [51] Zhou P, Tian FF, Zhang MJ et al. Applying generalized hydrophobicity scale of amino acids to quantitative prediction of human leukocyte antigen-A*0201-restricted cytotoxic T lymphocyte epitope. Chin. Sci. Bull., 2006, 51(12): 1439-1443
    [52] 梁桂兆, 周鹏, 周原等. 一组新氨基酸描述子用于肽定量构效关系研究. 化学学报, 2006, 64(5): 393-396
    [53] 梁桂兆, 梅虎, 周原等. 氨基酸描述子 SZOTT 用于多肽定量序效建模研究. 高等学校化学学报, 2006, 27(10): 1900-1902
    [54] 梁桂兆, 李志良, 周原等. 一种新多肽表征方法及支持向量机用于肽 HPLC 定量结构-保留建模预测. 物理化学学报, 2006, 22(9): 1052-1055
    [55] Liang GZ, Yang SB, Zhou Y et al. Using scores of amino acid topological descriptors for quantitative sequence-mobility modeling of peptides based on support vector machine. Chin. Sci. Bull., 2006, (51)22: 2700-2705
    [56] Liang GZ, Li ZL. A new sequence representation (FASGAI) as applied in better specificity elucidation for human immunodeficiency virus type 1 protease. Biopolymers (Pept. Sci.), 2007, DOI 10.002/bip 20669
    [57] Liang GZ, Li ZL. Factor analysis scale of generalized amino acid information as the source of a new set of descriptors for elucidating the structure and activity relationships of cationic antimicrobial peptides. QSAR Comb. Sci., 2007, DOI: 10.1002/qsar.200630145
    [58] 丁俊杰, 晓琴, 赵立峰等. 多肽定量构效关系与分子设计. 化学进展, 2005, 17: 130-136
    [59] 来鲁华. 蛋白质的结构预测. 中国科学基金, 1998, 11: 45-47
    [60] 来鲁华. 蛋白质的结构预测与分子设计. 北京: 北京大学出版社, 1993
    [61] 阎隆飞, 孙之荣. 蛋白质分子结构. 北京: 清华大学出版社, 1999
    [62] Anfinsen CΒ. Principles that govern the folding of protein chains. Science, 1973, 181(4096): 223-230
    [63] Nemethy G, Scheraga HA. Protein folding. Q. Rev. Biophys., 1977, 10(3): 239-252
    [64] Chou KC. Energy-optimized structure of antifreeze protein and its binding mechanism. J. Mol.Biol., 1992, 223, 509-517
    [65] Chou PY, Fasman GD. Conformational parameters for amino acids in helical, beta-sheet, and random coil regions calculated from proteins. Biochemistry, 1974, 13: 222-245
    [66] Lim VI. Structural principles of globular protein secondary structure. J. Mol. Biol., 1974, 88: 857-872
    [67] Garier J, Osguthorpe DJ, Robson B. Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J. Mol. Biol., 1978, 120: 97-120
    [68] Chou PY. Prediction of protein structural classes from amino acid composition. In prediction of protein structure and the principles of protein conformation (Fasman GD, Ed.). New York: Plenum Press, 1989
    [69] Klein P. Prediction of protein structural class by discriminant analysis. Biochim. Biophys. Acta, 1986, 874: 205-215
    [70] Nakashima H, Nishikawa K, Ooi T. The folding type of a protein is relevant to the amino acid composition. J. Biochem., 1986, 99: 152-162
    [71] Chou KC, Zhang CT. Prediction of protein structural classes. Crit. Rev. Biochem. Mol. Biol., 1995, 30: 275-349
    [72] Chou KC. Review: Prediction of protein structural classes and subcellular location. Curr. Protein Pept. Sci., 2001, 1: 171-208
    [73] Kabsch W, Sander C. Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 1983, 22: 2577-2637
    [74] Chou PY, Fasman GD. Prediction of secondary structure of proteins from amino acid sequence. Adv. Enzymol. Relat. Subj. Biochem., 1978, 47: 45-148
    [75] Fasman GD. The development of the prediction of protein structure. In prediction of protein structure and the principles of protein conformation (Fasman GD, Ed.). New York: Plenum Press, 1989
    [76] Gibrat JF, Garnier J, Robson B. Further developments of protein secondary structure prediction using information theory. New parameters and consideration of residue pairs. J. Mol. Biol., 1987, 198: 425-443
    [77] Lim VI. Structural principles of globular protein secondary structure. J. Mol. Biol., 1974, 88: 873-894
    [78] Levin JM, Robson B, Garnier J. An algorithm for secondary structure determination in proteins based on sequence similarity. FEBS Lett., 1986, 205(2): 303-308
    [79] Cohen FE, Abarbanel RM, Kuntz ID et al. Secondary structure assignment for alpha/betaproteins by a combinatorial approach. Biochemistry, 1983, 22: 4894-4904
    [80] Qian N, Sejnowski TJ. Predicting the secondary structure of globular proteins using neural network models. J. Mol. Biol., 1988, 202(4): 865-884
    [81] 邹承鲁. 周海梦. 潘宪明等. 第二遗传密码?. 长沙: 湖南科学技术出版社, 1997
    [82] Werner T. Models for prediction and recognition of eukaryotic promoters. Mamm. Genome, 1999, 10: 168-175
    [83] Pedersen AG, Baldi P, Brunak S et al. Characterization of prokaryotic and eukaryotic promoters using hidden Markov models. Proc. Int. Conf. Intell. Syst. Mol. Biol., 1996, 4: 182-191
    [84] Fickett JW, Hatzigeorgiou AG. Eukaryotic promoter recognition. Genome Res., 1997, 7: 861-878
    [85] Bucher P. Weight matrix descriptions of four eukaryotic RNA polymerase II promoter elements derived from 502 unrelated promoter sequences. J. Mol. Biol., 1990, 212: 563-578
    [86] Kadesch T. Consequences of heteromeric interactions among helix-loop-helix proteins. Cell Growth Differ., 1993, 4: 49-55
    [87] Hsu HL, Huang L, Tsan JT et al. Preferred sequences for DNA recognition by the TAL1 helix-loop-helix proteins. Mol. Cell. Biol., 1994, 14: 1256-1265
    [88] Berg OG, von Hippel PH. Selection of DNA binding sites by regulatory proteins. Trends Biochem. Sci., 1988, 13: 207-211
    [89] Barrick D, Villaneuba K, Childs J et al. Quantitative analysis of ribosome binding sites in E. coli. Nucleic Acids Res., 1994, 22: 1287-1295
    [90] Andres V, Cervera M, Mahdavi V. Determination of the consensus binding site for MEF2 expressed in muscle and brain reveals tissue-specific sequence constraints. J. Biol. Chem., 1995, 270: 23246-23249
    [91] Fickett JW. Quantitative discrimination of MEF2 sites. Mol. Cell Biol., 1996, 16: 437-441
    [92] Hofmann K, Bucher P. The FHA domain: A putative nuclear signalling domain found in protein kinases and transcription factors. Trends Biochem. Sci., 1995, 20: 347-349
    [93] Claverie JM, Audic S. The statistical significance of nucleotide position-weight matrix matches. Comp. Appl. Biosci., 1996, 12: 431-440
    [94] Brunak S, Engelbrecht J, Knudsen S. Prediction of human mRNA donor and acceptor sites from the DNA sequence. J. Mol. Biol., 1991, 220: 49-65
    [95] Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol., 1997, 268: 79-94
    [96] Bucher P, Trifonov EN. Compilation and analysis of eukaryotic POL II promoter sequences. Nucleic Acids Res., 1986, 14: 10009-10026
    [97] Wingender E, Dietze P, Karas H et al. TRANSFAC: A database on transcription factors and their DNA binding sites. Nucleic Acids Res., 1996, 24: 238-241
    [98] Kel OV, Romachenko AG, Kel AE et al. Structure of data representation in TRRD—Database of transcription regulatory regions on eukaryotic genomes. In Proceedings of the 28th Annual Hawaii International Conference on System Sciences v5, Biotechnology Computing. Los Alamitos: IEEE, Computer Society Press, 1994
    [99] Chen QK, Hertz GZ, Stormo G. MATRIX SEARCH 1.0: A computer program that scans DNA sequences for transcriptional elements using a database of weight matrices. Comp. Appl. Biosci., 1996, 11: 563-566
    [100] Benham CJ. Computation of DNA structural variability—A new predictor of DNA regulatory regions. Comp. Appl. Biosci., 1996, 12: 375-382
    [101] Duret L, Bucher P. Searching for regulatory elements in human noncoding sequences. Curr. Opin. Struct. Biol., 1997, 7: 399-406
    [102] Schwartz S, Zhang Z, Frazer KA et al. PipMaker—a web server for aligning two genomic DNA sequences. Genome Res., 2000, 10: 577-586
    [103] Claverie JM, Sauvaget I, Bougueleret L. K-tuple frequency analysis: From intron/exon discrimination to T-cell epitope mapping. Methods Enzymol., 1990, 183: 237-252
    [104] Fickett JW. Coordinate positioning of MEF2 and myogenin binding sites. Gene, 1996, 172: GC19-GC32
    [105] Quandt K, Frech K, Karas H et al. MatInd and MatInspector—New fast and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Res., 1995, 23: 4878-4884
    [106] Prestridge DS. Predicting Pol II promoter sequences using transcription factor binding sites. J. Mol. Biol., 1995, 249: 923-932
    [107] Ptitsyn AA, Rogozin IB, Grigorovich DA et al. Computer system "AutoGene" for automatic analysis of nucleotide sequences. Mol. Biol. (Mosk), 1996, 30(2): 432-441
    [108] Ioshikhes IP, Zhang MQ. Large-scale human promoter mapping using CpG islands. Nat. Genet., 2000, 26: 61-63
    [109] Ohler U, Harbeck S, Niemann H. Interpolated Markov chains for eukaryotic promoter recognition. Bioinformatics, 1999, 15(5): 362-369
    [110] Knudsen S. Promoter2.0: for the recognition of PoⅢ promoter sequences. Bioinformatics, 1999, 15(5): 356-361
    [111] Reese MG. Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome. Comput. Chem., 2001, 26(1): 51-56
    [112] Liang GZ, Li ZL. Scores of generalized base properties for quantitative sequence-activity modelings for E.coli promoters based on support vector machine. J. Mol. Graph. Model., 2007, doi:10.1016/j.jmgm.2006.12.004
    [113] Kirchhamer CV, Yuh CH, Davidson EH. Modular cis-regulatory organization of developmentally expressed genes: two genes transcribed territorially in the sea urchin embryo, and additional examples. Proc. Natl. Acad. Sci. U.S.A.,1996, 93: 9322-9328
    [114] Sap J, Munoz A, Schmitt J et al. Repression of transcription mediated at a thyroid homone response element by the v-erb-A oncogene product. Nature, 1989, 340: 242-244
    [115] Bohjanen PR, Liu Y, GarciaBlanco MA. TAR RNA decoys inhibit Tat-activated HIV-1 transcription after preinitiation complex formation. Nucleic Acids Res., 1997, 25: 4481-4486
    [116] Wang WD, Chi TH, Xue YT et al. Architectural DNA binding by a high-mobility-group /kinesin-like subunit in mammalian SWI/SNF-related complexes. Proc. Natl. Acad. Sci. U.S.A., 1998, 95: 492-498
    [117] 姚凤霞, 张瑞芳, 刘春宇等. 真核生物 RNA 聚合酶Ⅱ启动子的计算机预测. 国外医学.遗传学分册, 2005, 28: 6-9
    [118] Kawashima S, Kanehisa M. AAindex: Amino acid index database. Nucleic Acids Res., 2000, 28: 374
    [119] Johnson RA, Wichern DW. Applied multivariate statistical analysis. New Jersey: Prentice Hall, Upper Saddle River, 2002
    [120] Taylor WR. The classication of amino acid conservation. J. Thero. Biol., 1986, 119: 205-218
    [121] Kim D, Lee I-B. Process monitoring based on probabilistic PCA. Chemometr. Intell. Lab. Syst., 2003, 67: 109-123
    [122] Todeschini R, Gramatica P. 3D-modelling and prediction by WHIM descriptors. Part 6. Application of WHIM descriptors in QSAR studies. Quant. Struct.-Act. Relat., 1997, 16: 113-119
    [123] Magnuson VR, Harriss DK, Basak SC. Studies in physical and theoretical chemistry (King, R.B., ed.). Amsterdam: Elsevier (The Netherlands), 1983
    [124] Moran PAP. Note on continuous stochastic phenomena. Biometrika, 1950, 37: 17-23
    [125] Schuur JH, Selzer P, Gasteiger J. The coding of the three-dimensional structure of moleculesby molecular transforms and its application to structure-spectra correlations and studies of biological activity, J. Chem. Inf. Comput. Sci., 1996, 36: 334-344
    [126] Devillers J, Balaban AT. Eds. Topological indices and related descriptors in QSAR and drug design. Amsterdam: Gordon & Breach (The Netherlands), 2000
    [127] Todeschini R, Consonni V. Handbook of molecular descriptors. Weinheim: Wiley-VCH, 2000
    [128] 梁逸曾, 俞汝勤. 化学计量学. 北京: 高等教育出版社, 2003
    [129] Rencher AC, Pun FC. Inflation of R2 in best subset regression. Technometrics, 1980, 22: 49-53
    [130] Ruymgaart FH. A robust principal component analysis. J. Multivariate Anal., 1981, 11: 485-497
    [131] 许禄, 邵学广. 化学计量学方法. 第二版. 北京: 科学出版社. 2004
    [132] Wold S, Sj?str?m M, Eriksson L. PLS-regression: A basic tool of chemometrics. Chemometr. Intell. Lab. Syst., 2001, 58: 109-130
    [133] 王惠文. 偏最小二乘回归方法及其应用. 北京: 国防工业出版社, 1999
    [134] Vapnik V, The nature of statistical learning theory. New York: Springer, 1995
    [135] Myers RH, Montgomery DC. Response surface methodology: Process and product optimization using designed experiments. New York: Wiley, 1995
    [136] Ripley BD. Pattern recognition and neural networks. Cambridge: Cambridge University Press, 1996
    [137] Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating error. Nature, 1986, 323: 533-536
    [138] Leardi R, Gonzáles AL. Genetic algorithms applied to feature selection in PLS regression: How and when to use them. Chemometr. Intell. Lab. Syst., 1998, 41: 195-207
    [139] 侯廷军, 徐筱杰. 遗传算法在计算机辅助药物分子设计中的应用. 化学进展, 2004, 16(1): 35-41
    [140] Hasegawa K, Miyashita Y, Funatsu K. GA strategy for variable selection in QSAR studies: GA based PLS analysis of calcium channel antagonists. J. Chem. Inf. Comput. Sci., 1997, 37: 306-310
    [141] Kaur H, Raghava GPS. A neural network method for prediction of β-turn types in proteins using evolutionary information. Bioinformatics, 2004, 20: 2751-2758
    [142] Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochem. Biophys. Acta, 1975, 405: 442-451
    [143] Deleo JM. Receiver operating characteristic laboratory (ROCLAB): Software for developing decision strategies that account for uncertainty. In: Proceedings of the second international symposium on uncertainty modelling and analysis. College Park, MD: IEEE, Computer Society Press, 1993
    [144] Golbraikh A, Tropsha A. Beware of q2! J. Mol. Graphics Mod., 2002, 20: 269-276
    [145] Gramatica P, Pilutti P, Papa E. Validated QSAR prediction of OH tropospheric degradation of VOCs: Splitting into training-test sets and consensus modeling. J. Chem. Inf. Comput. Sci., 2004, 44: 1794-1802
    [146] Armas RR, Díaz HG, Molina R et al. Stochastic-based descriptors studying peptides biological properties: modeling the bitter tasting threshold of dipeptides. Bioorg. Med. Chem., 2004, 12: 4815-4822
    [147] Mei H, Liao Z, Zhou Y et al. A new set of amino acid descriptors and its application in peptide QSARs. Biopolymers (Pept. Sci.), 2005, 80: 775-786
    [148] Crackower MA, Sarao R, Oudit GY et al. Angiotensin-converting enzyme 2 is an essential regulator of heart function. Nature, 2002, 417: 822-828
    [149] Mei H, Zhou Y, Sun LL et al. A new amino acid descriptor and its application in peptide QSAR (in Chinese). Acta Phys.-Chim. Sin., 2004, 20(8): 821-825
    [150] Sima P, Trebichavsky I, Sigler K. Mammalian antibiotic peptides. Folia Microbiol., 2003, 48: 123-137
    [151] Cherkasov A, Jankovic B. Application of ‘Inductive’ QSAR descriptors for quantification of antibacterial activity of cationic polypeptides. Molecules, 2004, 9: 1034-1052
    [152] Cronin MTD, Aptula AO, Dearden JC et al. Structure-based classification of antibacterial activity. J. Chem. Inf. Comput. Sci., 2002, 42: 869-878
    [153] Molina E, Diaz HG, Gonzalez MP et al. Designing antibacterial compounds through a topological substructural approach. J. Chem. Inf. Comput. Sci., 2004, 44: 515-521
    [154] Hancock RE, Lehrer R. Cationic peptides: A new source of antibiotics. Trends Biotechnol., 1998, 16: 82-88
    [155] Takeshima K, Chikushi A, Lee KK et al. Translocation of analogues of the antimicrobial peptides magainin and buforin across human cell membranes. J. Biol. Chem., 2003, 278: 1310-1315
    [156] Jaen-Oltra J, Salabert-Salvador MT, Garcia-March FJ et al. Artificial neural network applied to prediction of fluorquinolone antibacterial activity by topological methods. J. Med. Chem., 2000, 43: 1143-1148
    [157] Baker MA, Maloy WL, Zasloff M et al. Anticancer efficacy of Magainin2 and analogue peptides. Cancer Res., 1993, 53: 3052-3057
    [158] Epand RM, Vogel HJ. Diversity of antimicrobial peptides and their mechanisms of action. Biochim. Biophys. Acta, 1999, 1462: 11-28
    [159] Farmerie WG, Loeb DD, Casavant NC et al. Expression and processing of the AIDS virus reverse transcriptase in Escherichia coli. Science, 1987, 236(4799): 305-308
    [160] Kohl NE, Emini EA, Schleif WA et al. Active human immunodeficiency virus protease is required for viral infectivity. Proc. Natl. Acad. Sci. U.S.A., 1988, 85: 4686-4690
    [161] You L, Garwicz D, Rögnvaldsson T. Comprehensive bioinformatics analysis of the specificity of human immunodeficiency virus type 1 protease. J. Virol., 2005, 79: 12477-12486
    [162] Schechter I, Berger A. On the size of the active site in proteases. Biochem. Biophys. Res. Commun., 1967, 27: 157-162
    [163] Chou KC. Review: Prediction of HIV protease cleavage sites in proteins. Anal. Biochem., 1996, 233: 1-14
    [164] Poorman RA, Tomasselli AG, Heinrikson RL et al. A cumulative specificity model for protease from human immunodeficiency virus types 1 and 2, inferred from statistical analysis of an extended substrate data base. J. Biol. Chem., 1991, 266(22): 14554-14561
    [165] Cai YD, Chou KC. Artificial neural network model for predicting HIV protease cleavage sites in protein. Adv. Eng. Software, 1998, 29(2): 119-128
    [166] Narayanan A, Wu X, Yang ZR. Mining viral protease data to extract cleavage knowledge. Bioinformatics, 2002, 18: S5-S13
    [167] Yang ZR, Chou KC. Bio-support vector machines for computational proteomics. Bioinformatics, 2004, 20: 735-741
    [168] Thomson R, Hodgman TC, Yang ZR et al. Characterizing proteolytic cleavage site activity using bio-basis function neural networks. Bioinformatics, 2003, 19: 1741-1747
    [169] Prabu-Jeyabalan M, Nalivaika E, Schiffer CA. Substrate shape determines specificity of recognition for HIV-1 protease: Analysis of crystal structures of six substrate complexes. Structure, 2002, 10: 369-381
    [170] Clemente JC, Moose RE, Hemrajani R et al. Comparing the accumulation of active- and nonactive-site mutations in the HIV-1 protease. Biochemistry, 2004, 43: 12141-12151
    [171] Rammensee HG, Falk K, Rotzschke O. Peptides naturally presented by MHC class I molecules. Annu. Rev. Immunol., 1993, 11: 213-244
    [172] Brusic V, Rudy G, Harrison LC. MHCPEP, a database of MHC-binding peptides: update 1997. Nucleic Acids Res., 1998, 26: 368-371
    [173] Madden DR, Garboczi DN, Wiley DC. The antigenic identity of peptide-MHC complexes: A comparison of the conformations of five viral peptides presented by HLA-A2. Cell, 1993, 75: 693-708
    [174] Brusic V, Bajic VB, Petrovsky N. Computational methods for prediction of T-cell epitopes—a framework for modelling, testing, and applications. Methods, 2004, 34: 436-443
    [175] Stern LJ, Brown JH, Jardetzky TS et al. Crystal structure of the human class II MHC protein HLA-DR1 complexed with an influenza virus peptide. Nature, 1994, 368: 215-221
    [176] Hammer J, Bono E, Gallazzi F et al. Precise prediction of major histocompatibility complex class II-peptide interaction based on peptide side chain scanning. J. Exp. Med., 1994, 180: 2353-2358
    [177] Honeyman MC, Brusic V, Stone NL et al. Neural network-based prediction of candidate T-cell epitopes. Nat. Biotechnol., 1998, 16: 966-969
    [178] Bhasin M, Raghava GP. SVM based method for predicting HLA-DRB1*0401 binding peptides in an antigen sequence. Bioinformatics, 2004, 20: 421-423
    [179] Mamitsuka H. Predicting peptides that bind to MHC molecules using supervised learning of hidden Markov models. Proteins: Struct. Funct. Genet., 1998, 33: 460-474
    [180] Wan SZ, Coveney P, Flower DR. Large-scale molecular dynamics simulations of HLA-A*0201 complexed with a tumor-specific antigenic peptide: Can the α3 and β2m domains be neglected? J. Comput. Chem., 2004, 25: 1803-1813
    [181] Kosmopoulou A, Vlassi M, Stavrakoudis A et al. T-Cell epitopes of the La/SSB autoantigen: Prediction based on the homology modeling of HLA-DQ2/DQ7 with the insulin-B peptide/HLA-DQ8 complex. J. Comput. Chem., 2006, 27: 1033-1044
    [182] Doytchinova IA, Flower DR. Toward the quantitative prediction of T-Cell epitopes: CoMFA and CoMSIA studies of peptides with affinity for the Class I MHC molecule HLA-A*0201. J. Med. Chem., 2001, 44: 3572-3581
    [183] Falk K, Rotzschke O, Stevanovic S et al. Allele specific motifs revealed by sequencing of self-peptides eluted from MHC molecules. Nature, 1991, 351: 290-296
    [184] Madden DR. The three-dimensional structure of peptide-MHC complexes. Annu. Rev. Immunol., 1995, 13: 587-622
    [185] Ruppert J, Sidney J, Celis E et al. Prominent role of secondary anchor residues in peptide binding to HLA-A*0201 molecules. Cell, 1993, 74: 929-937
    [186] Dalgarno DC, Botfield MC, Rickles RJ. SH3 domains and drug design: ligands, structure, and biological function. Biopolymers, 1997, 43: 383-400
    [187] Ren RB, Mayer BJ, Cicchetti P et al. Identification of a 10-amino acid proline-rich SH3 binding-site. Science, 1993, 259: 1157-1161
    [188] Rickles RJ, Botfield MC, Zhou XM et al. Phage display selection of ligand residues important for Src homology 3 domain binding specificity. Proc. Natl. Acad. Sci. U.S.A., 1995, 92: 10909-10913
    [189] Wang W, Lim WA, Jakalian A et al. An analysis of the interactions between the Sem-5 SH3 domain and its ligands using molecular dynamics, free energy calculations, and sequence analysis. J. Am. Chem. Soc., 2001, 123: 3986-3994
    [190] Slepnev VI, Ochoa GC, Butler MH et al. Role of phosphorylation in regulation of the assembly of endocytic coat complexes. Science, 1998, 281: 821-824
    [191] Landgraf C, Panni S, Montecchi-Palazzi L et al. Protein interaction networks by proteome peptide scanning. PLOS Biol., 2004, 2: 94-103
    [192] Hou TJ, McLaughlin W, Lu BZ et al. Prediction of binding affinities between the human amphiphysin-1 SH3 domain and its peptide ligands using homology modeling, molecular dynamics and molecular field analysis. J. Proteome Res., 2006, 5(1): 32-43
    [193] Hou TJ, Li ZM, Li Z et al. Three-dimensional quantitative structure-activity relationship analysis of the new potent sulfonylureas using comparative molecular similarity indices analysis. J. Chem. Inf. Comput. Sci., 2000, 40: 1002-1009
    [194] Murre C, Schonleber MP, Baltimore D. A new DNA binding and dimerization motif in immunoglobulin enhancer binding, daughterless, MyoD, and myc protein. Cell, 1989, 56: 777-783
    [195] Jan YN, Jan LY. HLH proteins, fly neurogenesis, and vertebrate myogenesis. Cell, 1993, 75: 827-830
    [196] Voronova A, Baltimore D. Mutations that disrupt DNA binding and dimer formation in the E47 helix-loop-helix protein map to distinct domains. Proc. Natl. Acad. Sci. U.S.A., 1990, 87: 4722-4726
    [197] Atchley WR, Fitch WM. A natural classification of the basic helix-loop-helix class of transcription factors. Proc. Natl. Acad. Sci. U.S.A., 1997, 94: 5172-5176
    [198] Morgenstern B, Atchley WR. Evolution of bHLH transcription factors: Modular evolution by domain shuffling? Mol. Biol. Evol., 1999, 16: 1654-1663
    [199] Dang CV, Dolde C, Gillison ML et al. Discrimination between related DNA sites by a single amino acid residue of Myc-related basic-helix-loop-helix proteins. Proc. Natl. Acad. Sci. U.S.A., 1992, 89: 599-602
    [200] Hu YF, Luscher B, Admon A et al. Transcription factor AP-4 contains multiple dimerization domains that regulate dimer specificity. Genes Dev., 1990, 4: 1741-1752
    [201] Swanson HI, Chan WK, Bradfield CA et al. DNA binding specificities and pairing rules of the Ah receptor, ARNT, and SIM proteins. J. Biol. Chem., 1995, 270: 26292-26302
    [202] Massari ME, Murre C. Helix-loop-helix proteins: Regulators of transcription in eucaryotic organisms. Mol. Cell Biol., 2000, 20(2): 429-440
    [203] Ledent V, Vervoort M. The basic helix-loop-helix protein family: Comparative genomics and phylogenetic analysis. Genome Res., 2001, 11: 754-770
    [204] Atchley WR, Terhalle W, Dress A. Positional dependence, cliques, and predictive motifs in the bHLH protein domain. J. Mol. Evol., 1999, 48: 501-516
    [205] Chou KC. Prediction of tight turns and their types in proteins. Anal. Biochem., 2000, 286: 1-16
    [206] Zhang CT, Chou KC. Prediction of beta-turns in proteins by 1–4 & 2–3 Correlation Model. Biopolymers, 1997, 41: 673-702
    [207] Chou KC. Prediction of beta-turns. J. Pept. Res., 1997, 49: 120-144
    [208] Wilmot CM, Thornton JM. Beta-turns and their distortions: A proposed new nomenclature. Protein Eng., 1990, 3: 479-493
    [209] Shepherd AJ, Gorse D, Thornton JM. Prediction of the location and type of beta-turns in proteins using neural networks. Protein Sci., 1999, 8: 1045-1055
    [210] Kaur H, Raghava GP. An evaluation of beta-turn prediction methods. Bioinformatics, 2002, 18: 1508-1514
    [211] Kaur H, Raghava GP. Prediction of β-turns in proteins from multiple alignment using neural network. Protein Sci., 2003, 12: 627-634
    [212] Kim S. Protein β-turn prediction using nearest-neighbor method. Bioinformatics, 2004, 20: 40-44
    [213] Zhang QD, Yoon S, Welsh WJ. Improved method for predicting β-turn using support vector machine. Bioinformatics, 2005, 21: 2370-2374
    [214] Guruprasad K, Rajkumar S. Beta- and gamma-turns in proteins revisited: a new set of amino acid turn-type dependent positional preferences and potentials. J. Biosci., 2000, 25: 143-156
    [215] Hutchinson EG, Thornton JM. PROMOTIF—a program to identify and analyze structural motifs in proteins. Protein Sci., 1996, 5: 212-220
    [216] Jones DT. Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol., 1999, 292: 195-202
    [217] Gokhale RS, Khosla C. Role of linkers in communication between protein modules. Curr. Opin. Chem. Biol., 2000, 4: 22-27
    [218] George RA, Heringa J. An analysis of protein domain linkers: Their classification and role in protein folding. Protein Eng., 2002, 15: 871-879
    [219] Attwood TK, Croning MD, Gaulton A. Deriving structural and functional insights from a ligand-based hierarchical classification of G protein-coupled receptors. Protein Eng., 2002, 15: 7-12
    [220] Horn F, Weare J, Beukers MW et al. GPCRDB: An information system for G protein-coupled receptors. Nucleic Acids Res., 1998, 26: 275-279
    [221] Drews J. Genomic sciences and the medicine of tomorrow. Nat. Biotechnol., 1996, 14: 1516-1518
    [222] Yin YB, Luo JC, Jiang Y. Advances in G-protein coupled receptor research and related bioinformatics study. Chin. Sci. Bull., 2003, 48(6): 511-516
    [223] Altschul SF, Madden TL, Schaffer AA et al. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res., 1997, 25: 3389-3402
    [224] Chou KC. Prediction of G-protein-coupled receptor classes. J. Proteome Res., 2005, 4: 1413-1418
    [225] Papasaikas PK, Bagos PG, Litou ZI et al. PRED-GPCR: GPCR recognition and family classification server. Nucleic Acids Res., 2004, 32: W380-W382
    [226] Huang Y, Cai J, Ji L et al. Classifying G-protein coupled receptors with bagging classification tree. Comput. Biol. Chem., 2004, 28: 39-49
    [227] Karchin R, Karplus K, Haussler D. Classifying G-protein coupled receptors with support vector machines. Bioinformatics, 2002, 18: 147-159
    [228] Bhasin M, Raghava GPS. Gpcrpred: An svm-based method for prediction of families and subfamilies of G-protein coupled receptors. Nucleic Acids Res., 2004, 32: W383-W389
    [229] Gaulton A, Attwood TK. Bioinformatics approaches for the classification of G-protein-coupled receptors. Curr. Opin. Pharmacol., 2003, 3: 114-120
    [230] Wold S, Jonsson J, Sjöström M et al. DNA and peptide sequences and chemical processes multivariately modelled by principal component analysis and partial least squares projections to latent structures. Anal. Chim. Acta, 1993, 277: 239-253
    [231] Zhao XM, Huang DS, Zhang SW et al. Classifying G-protein coupled receptors with hydropathy blocks and support vector machines. LNBI, 2006, 4115: 593-602
    [232] Fouchier RA, Munster V, Wallensten A et al. Characterization of a novel influenza A virus hemagglutinin subtype (H16) obtained from black-headed gulls. J. Virol., 2005, 79: 2814-2822
    [233] Stevens J, Blixt O, Tumpey TM et al. Structure and receptor specificity of the hemagglutinin from an H5N1 influenza virus. Science, 2006, 312: 404-410
    [234] Beigel JH, Farrar J, Han AM et al. The writing committee of the World Health Organization (WHO) consultation on human influenza A/H5. Avian influenza A (H5N1) infection in humans. N. Engl. J. Med., 2005, 353: 1374-1385
    [235] Garten W, Klenk HD. Understanding influenza virus pathogenicity. Trends Microbiol., 1999, 7(3): 99-100
    [236] Stöhr K. Avian influenza and pandemics: Research needs and opportunities. N. Engl. J. Med., 2005, 352: 405-407
    [237] Capua I, Alexander DJ. Avian influenza and human health. Acta Trop., 2002, 83: 1-6
    [238] World Health Organization. Recommended laboratory tests to identify influenza A/H5 virus in specimens from patients with an influenza-like illness. 2005. (Accessed September 2, 2005, at http://www.who.int/csr/disease/avian_influenza/guidelines/avian_labtests1.pdf.)
    [239] Council of the European Communities. Council directive 92/40/EEC of 19th May 1992 introducing community measures for the control of avian influenza. Off J European Communities, 1992, L167: 1-15
    [240] Marr MT, Roberts JW. Promoter recognition as measured by binding of polymerase to nontemplate strand oligonucleotide. Science, 1997, 276: 1258-1260
    [241] Aguiar PF, Bourguignon B, Khots MS et al. Tutorial D-optimal designs. Chemometr. Intell. Lab. Syst., 1995, 30: 199-210
    [242] Liu HX, Zhang RS, Yao XJ et al. Prediction of the isoelectric point of an amino acid based on GA-PLS and SVMs. J. Chem. Inf. Comput. Sci., 2004, 44: 161-167
    [243] Prestridge DS. Computer software for eukaryotic promoter analysis. Methods Mol. Biol., 2000, 130: 265-295
    [244] Bajic VB, Tan SL, Suzuki Y et al. Promoter prediction analysis on the whole human genome. Nat. Biotechnol., 2004, 22: 1467-1473
    [245] Vanet A, Marsanc L, Sagot M. Promoter sequences and algorithmical methods for identifying them. Res. Microbiol., 1999, 150(9-10): 779-799
    [246] Reese MG, Kulp D, Tammana H et al. Genie-Gene finding in Drosophila melanogaster. Genome Res., 2000, 10(4): 529-538
    [247] Bajic VB, Seah SH, Chong A et al. Computer model for recognition of functional transcription start sites in RNA polymerase II promoters of vertebrates. J. Mol. Graph. Model., 2003, 21: 323-332
    [248] Davuluri RV, Grosse I, Zhang MQ. Computational identification of promoters and first exons in the human genome. Nat. Genet., 2001, 29: 412-417
    [249] Halees AS, Leyfer D, Weng ZP. PromoSer: A large-scale mammalian promoter and transcription start site identification service. Nucleic Acids Res., 2003, 31: 3554-3559
