Construction and Application of the Silkworm Genome Database
Abstract
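The silkworm is an economically important insect and a representative species of the Lepidoptera. The silkworm genome project is significant in two respects. First, it promotes basic research on silkworm physiology, biochemistry, and metabolism, and helps elucidate the molecular mechanisms underlying cocoon silk formation, laying a foundation for transforming the traditional silk industry with modern biotechnology. Second, it can suggest new ideas and methods for controlling agricultural and forestry pests. In addition, the use of the silkworm as a bioreactor has attracted attention.
     Driven by the Human Genome Project and the genome projects of other model organisms, whole-genome sequencing of the silkworm was completed in 2003, with Chinese and Japanese groups obtaining draft genome sequences at 6-fold and 3-fold coverage, respectively. However, these drafts did not cover the genome completely, and some genes were present only as fragments. To obtain a higher-quality sequence, the Chinese and Japanese groups collaborated on a fine map of the silkworm genome: they exchanged sequencing data, jointly filled genomic gaps and developed molecular markers, and finally assembled the combined data in a unified way, completing the fine map of the silkworm genome in 2007.
     The completion of a high-quality fine map of the silkworm genome provides a sound basis for subsequent studies of silkworm gene function. However, an urgent problem remains: how to let researchers conveniently access the fine-map data so that they can obtain information or clues useful for gene function studies. To address this, we annotated the functions of silkworm genes with several bioinformatics methods and analyzed silkworm gene expression using microarray data. After organizing these data together with data related to the fine map, we built a silkworm genome database that is rich in information, easy to use, and equipped with a full set of analysis tools. Based on the fine-map data and the genome database, we also identified and analyzed the C2H2 zinc-finger protein genes of the silkworm. The main results are as follows: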
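     1. Functional annotation of silkworm genes
     We predicted the functions of silkworm genes with several bioinformatics methods; this information provides clues for subsequent gene function studies.
     (1) Sequence similarity search: On the principle that genes with similar sequences are likely to have similar functions, the 14623 predicted silkworm genes were searched against the nr non-redundant protein sequence database to obtain hints about gene function. Of these, 12246 genes (83.7% of all silkworm genes) had similar genes (E-value<1E-5), and 5250 of them were highly conserved (E-value<1E-80). Analysis showed that the highly conserved genes are associated with basic physiological and metabolic processes such as DNA replication, energy metabolism, protein synthesis, lipid metabolism, and carbohydrate metabolism. The remaining 2377 genes had no similar genes, indicating that they are silkworm-specific genes; we speculate that they may be related to silkworm-specific physiological and metabolic processes.
     (2) Protein domain analysis: The function of a gene mainly refers to the function of the protein it encodes, in which protein domains play an important role. Analyzing the protein domains in genes can therefore provide important clues to gene function. Protein domains in silkworm genes were analyzed using the InterPro database. The results showed that 8522 genes, 58.2% of all silkworm genes, contain protein domains. These genes contain 2509 different types of protein domain in total, the most numerous being C2H2, LRR_1, WD40, Ank, and I-set. Predicting gene function from domain information has two benefits. First, it compensates for the limitations of sequence similarity searches: 79 genes that received no functional annotation from the similarity search obtained functional information from their domains. Second, for genes with multiple protein domains, domain information reflects gene function more comprehensively.
     (3) Analysis based on an orthologous gene database: The COG database stores orthologous genes from different species. Silkworm genes were analyzed against this database. In total, 7839 genes could be assigned to orthologous gene clusters (E-value<1E-5). The functional categories containing the most genes were general function prediction only, signal transduction mechanisms, posttranslational modification, protein turnover and chaperones, and lipid transport and metabolism, with 1602, 987, 593, 436, and 391 silkworm genes, respectively. In addition, we analyzed silkworm genes using the lineage-specific gene set in the COG database. The results showed that 533 genes could be assigned to lineage-specific gene clusters (E-value<1E-5). Of these, 475 belong to gene clusters specific to silkworm and Drosophila, indicating that they are insect-specific genes that may be closely related to insect-specific physiological and metabolic processes.
     Taken together, 6580 genes were annotated by all three methods. Each method has its own strengths and weaknesses; combining them reflects gene function more comprehensively.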
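     2. Analysis of silkworm microarray data and database construction
     Differences in gene expression across time and space determine physiological and metabolic processes such as development, differentiation, cell cycle regulation, senescence, and programmed cell death. To obtain gene expression information at the whole-genome level, our laboratory, in cooperation with the National Engineering Center for Beijing Biochip Technology, custom-designed the world's first genome-wide oligonucleotide microarray for the silkworm. The microarray was used to measure gene expression profiles in 10 tissues (or samples) of day-3 fifth-instar larvae: midgut, integument, head, hemocytes, testis, ovary, anterior/median silk gland, posterior silk gland, fat body, and Malpighian tubule. In this study, we analyzed the resulting microarray data, tested their reliability, and built a silkworm microarray database to make the data publicly accessible.
     Analysis of the microarray data showed that 10393 genes were detected as expressed (that is, expressed in at least one tissue), accounting for 44.5% of all genes on the microarray. Of these, 306 genes were highly expressed in every tissue; many of them are housekeeping genes, such as ribosomal protein, tubulin, translation elongation factor, and actin genes. At least 1642 genes showed tissue-specific expression. They were most numerous in the testis, midgut, and Malpighian tubule, with 1104, 216, and 110 tissue-specific genes, respectively. Combined with the functional annotation, the analysis showed that tissue-specifically expressed genes are closely related to the physiological functions of the corresponding tissues. We also identified at least 209 genes that are co-expressed in only two tissues; these genes reflect similar physiological functions or cellular components shared between tissues.
     To assess the reliability of the microarray data, we used several approaches, including informatics analysis and experimental validation. All of them indicated that the microarray data are reliable and that our data analysis procedure is accurate. After organizing the expression data, we built the silkworm microarray database BmMDB (http://silkworm.swu.edu.cn/microarray), which gives convenient access to the silkworm microarray data.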
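     3. Construction of the silkworm genome database
     With the completion of the fine map, the quality of the silkworm genome improved markedly: a high-quality predicted gene set was obtained, and 87.4% of the genome sequence could be anchored to chromosomes. To provide access to the fine-map data and to offer more integrated information, we reorganized the data resources and rebuilt the silkworm genome database. The new database is available at http://silkworm.swu.edu.cn/silkdb or http://silkworm.genomics.org.cn.
     The new database uses the GBrowse genome browser for navigation, replacing the MapView browser of the previous database. GBrowse is an internationally used genome browser that gives convenient access to any region of interest in the silkworm genome. The database also offers several search methods: users can search by keyword, gene ID, and so on, or use the site's BLAST tool to run sequence similarity searches against silkworm ESTs, genome sequences, and gene sequences. Based on the chromosomal information of the genome sequence, we also developed the silkworm chromosome browser SCB and the silkworm chromosome mapping tool SilkMap to make the silkworm data easier to use.
     The gene page is the core of the silkworm genome database. It displays detailed information about a gene, such as protein domains, GO classification, similarity-based annotation, gene family, gene expression, references, and gene sequences, providing important references and clues for further gene function studies.
     The database also provides some commonly used online analysis tools to support analysis of the silkworm genome data. As next steps, we will gradually correct problematic data in the current database and organize and add more experimental data, such as SAGE, SNP, RNAi phenotype, and mutant phenotype information, to enrich the database. In short, the silkworm genome database will play an important role in accelerating silkworm gene function research.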
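     4. Identification of silkworm C2H2 zinc-finger proteins
     C2H2 zinc fingers bind DNA specifically. Genes containing this domain are called C2H2 zinc-finger protein genes and can play key regulatory roles in embryonic development, cell differentiation, metamorphosis, and other processes. Using the HMM model of the C2H2 zinc finger in the Pfam database (PF00096), we identified the C2H2 zinc-finger protein genes in the silkworm genome. The silkworm genome contains at least 338 C2H2 zinc-finger protein genes, 2.3% of all genes. Compared with Drosophila, both the number of C2H2 zinc-finger protein genes and the number of C2H2 zinc-finger domains are markedly higher in the silkworm. The additional silkworm genes are mainly genes containing more than 10 C2H2 zinc fingers.
     Besides zinc-finger domains, some zinc-finger proteins contain other types of domain, called zinc-finger-associated domains, which help zinc-finger proteins activate or repress the expression of target genes. In the silkworm, 90 C2H2 zinc-finger proteins have zinc-finger-associated domains. The most numerous is the ZAD domain, with 50 ZAD domains distributed among 50 silkworm genes. By comparison, the nematode has no ZAD domains and the human genome has only one, whereas Drosophila has 87. This indicates that ZAD domains have expanded specifically in insects, and we speculate that the functions of ZAD-containing genes may be related to insect-specific physiological and metabolic processes.
     Analysis of the genomic distribution of silkworm C2H2 zinc-finger protein genes showed that 324 genes could be mapped to chromosomes. Using "distance between neighboring genes of less than 500 kb" as the criterion for tandemly duplicated genes, 241 genes fall into 59 tandem duplication clusters. The largest cluster is on chromosome 24, where 43 C2H2 zinc-finger protein genes lie within a 650 kb region. Most genes are arranged in clusters on the chromosomes, indicating that tandem duplication has played an important role in increasing the number of silkworm C2H2 zinc-finger protein genes. Tandem duplication has also made the number of genes on different chromosomes highly uneven: silkworm C2H2 zinc-finger protein genes are concentrated on chromosomes 11, 15, and 24, which together hold 38.8% of all C2H2 zinc-finger protein genes.
     Gene family analysis helps provide clues to gene function. A comparative analysis that included human, nematode, and Drosophila genes showed that the silkworm C2H2 zinc-finger protein genes can be divided into 75 gene families. Of these, 63 are evolutionarily conserved, that is, at least one member of the family comes from nematode, Drosophila, or human. Among the conserved families, 32 have members only from Drosophila and silkworm, indicating that they are insect-specific gene families. Twelve families are specific to the silkworm. Together with the species-specific single-copy genes, the silkworm has 188 species-specific C2H2 zinc-finger protein genes, markedly more than the 120, 125, and 160 species-specific zinc-finger protein genes in nematode, Drosophila, and human, respectively. The silkworm has distinctive biological processes such as silk spinning and metamorphic development, and further functional study of these silkworm-specific zinc-finger protein genes may reveal how they relate to those processes.
     Day 3 of the fifth instar is the most important period in silkworm larval development: the silkworm begins to synthesize silk proteins in large quantities and prepares for metamorphosis. We used the microarray data to analyze the expression of silkworm C2H2 zinc-finger protein genes in different tissues at this stage. In total, 132 genes were expressed at this stage, of which 33 were expressed in every tissue and 14 showed tissue-specific expression. The expressed C2H2 zinc-finger protein genes may play very important roles at this stage. For example, among the genes expressed in all tissues, BmZFP286 belongs to the DNJA5 family and may be involved in protein folding at this stage; BmZFP104 belongs to the Ab family and may coordinate the movement of tissues or organs at this stage; and BmZFP160 is highly similar in sequence to the Drosophila crol gene, so we speculate that it may already have been induced by ecdysone and may be an ecdysone-induced early response gene.
     In summary, this study identified the C2H2 zinc-finger protein genes in the silkworm genome and obtained basic information about them, namely chromosomal distribution, gene family, and gene expression information, laying a foundation for further functional studies of silkworm C2H2 zinc-finger protein genes.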
The silkworm, Bombyx mori, is an economically important insect and a model for Lepidopteran insects. The silkworm genome sequencing project benefits several areas. On the one hand, it facilitates basic research on silkworm physiology, biochemistry, and metabolism, and helps reveal the molecular mechanisms of silk production. These advances provide a foundation for reforming traditional sericulture through modern biotechnology. On the other hand, it provides new methods for controlling pests in agriculture and forestry. In addition, researchers are paying attention to the use of the silkworm as a bioreactor.
     Following the Human Genome Project and the genome projects of other model organisms, 3x and 6x draft genome sequences of the silkworm were completed by a Japanese group and a Chinese group, respectively. However, the draft sequences are incomplete, and some genes are fragmentary. To obtain a more complete genome sequence, the Japanese and Chinese groups cooperated: they exchanged data, filled genomic gaps, and developed more molecular markers. In 2007, a fine genome sequence of the silkworm was successfully assembled.
     The completion of the fine assembly of the silkworm genome sequence lays a foundation for studying gene function. However, the main problem researchers face is how to conveniently access the genome data to obtain information or clues about genes. To solve this problem, we predicted the functions of all silkworm genes in several ways and obtained gene expression profiles by microarray. Based on these and other related data, we reconstructed the silkworm genome database. The database integrates various data resources and bioinformatics tools and provides a useful platform for the silkworm and Lepidopteran research community. In addition, the C2H2 zinc-finger protein genes were identified based on the fine genome sequence and the database. The main results are as follows.
     1. Functional annotation of silkworm genes
     Gene functions were predicted by a variety of methods. This information provides important clues for further study of gene function.
     (1) Gene function predicted by sequence similarity searches: This method rests on the observation that similar sequences often share similar functions. A total of 14623 genes were used as queries against the NCBI nr database. As a result, 12246 genes have homologs (E-value<1E-5), accounting for 83.7% of all genes. Of these, 5250 genes are highly conserved (E-value<1E-80). These conserved genes are involved in DNA replication, energy metabolism, protein synthesis, lipid metabolism, carbohydrate metabolism, and other basic physiological processes. In addition, 2377 genes have no homolog. These genes are likely silkworm-specific, so we suppose that they may be related to physiological functions particular to the silkworm.
     (2) Gene function predicted by protein domains: The function of a gene mainly refers to the function of the protein it encodes, and protein domains play a very important role in that function. Information about the protein domains in a gene therefore gives clues to its function. All silkworm genes were used as queries against the InterPro database. As a result, 8522 genes have domains, accounting for 58.2% of all genes. There are 2509 kinds of domains in total, and the most prevalent are C2H2, LRR_1, WD40, Ank, and I-set. On the one hand, this work compensates for the shortcomings of the sequence similarity search: 78 genes were annotated in this way but not by the homology search. On the other hand, gene function inferred from domains may be more comprehensive, especially for genes that contain multiple domains.
     (3) Gene function prediction based on the COG database: Gene function was predicted using the COG (Clusters of Orthologous Groups) database. The results showed that 7839 genes could be classified into the corresponding orthologous clusters (E-value<1E-5). Among them, the most enriched clusters are "General function prediction only", "Signal transduction mechanisms", "Posttranslational modification", "Protein turnover and chaperones", and "Lipid transport and metabolism"; they include 1602, 987, 593, 436, and 391 genes, respectively. In addition, we annotated the genes using the LSE (Lineage Specific Expansion) database. The results showed that 533 genes could be classified into the corresponding LSE clusters (E-value<1E-5). Among them, 475 silkworm genes belong to Drosophila lineage-specific gene groups. This indicates that these genes are insect-specific and may be related to insect-specific physiology.
     Comparative analysis showed that 6580 genes can be annotated by all three of the above methods. Each method has its own merits and drawbacks. Combining them reflects gene function more comprehensively.
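     The E-value cutoffs and the three-way overlap described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the input layout (a mapping from gene ID to best-hit E-value, with None for genes without a hit), and the toy gene IDs are all assumptions made for illustration. Only the two cutoffs (1E-5 and 1E-80) and the gene counts in the usage note come from the text.

```python
def classify_by_evalue(best_evalue, homolog_cutoff=1e-5, conserved_cutoff=1e-80):
    """Label each gene by its best-hit E-value (None means no hit was reported)."""
    classes = {}
    for gene, evalue in best_evalue.items():
        if evalue is None or evalue >= homolog_cutoff:
            classes[gene] = "no_homolog"        # candidate species-specific gene
        elif evalue < conserved_cutoff:
            classes[gene] = "highly_conserved"  # E-value < 1E-80
        else:
            classes[gene] = "homolog"           # 1E-80 <= E-value < 1E-5
    return classes


def annotated_by_all(*gene_sets):
    """Genes that received an annotation from every method."""
    return set.intersection(*map(set, gene_sets))


def percent(part, total):
    """Share of `part` in `total`, rounded to one decimal place."""
    return round(100 * part / total, 1)


# Hypothetical toy input: gene IDs and E-values are made up.
hits = {"g1": 1e-120, "g2": 1e-20, "g3": None, "g4": 1e-3}
classes = classify_by_evalue(hits)
```

     With the counts reported above, percent(12246, 14623) reproduces the stated 83.7% of genes with homologs.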
     2. Analysis of silkworm microarray data and database construction
     The spatio-temporal expression of genes controls development, differentiation, the cell cycle, senescence, programmed cell death, and other processes. To acquire more information about gene expression at the whole-genome scale, our lab and the National Engineering Center for Beijing Biochip Technology worked together and constructed the first genome-wide oligonucleotide microarray of the silkworm. This microarray was used to measure gene expression in 10 tissues/organs: midgut, integument, head, hemocyte, testis, ovary, anterior/median silk gland (A/MSG), posterior silk gland (PSG), fat body, and Malpighian tubule. In this study, we analyzed these microarray data and confirmed the reliability of the results. We made the silkworm microarray data publicly accessible through our database.
     Analysis of the microarray data showed that 10393 genes were detected in at least one tissue, accounting for 44.5% of all genes on the microarray. There are 306 genes that are abundantly expressed in every tissue, and most of them are housekeeping genes, such as ribosomal protein genes, tubulin genes, translation elongation factor genes, and actin genes. At least 1642 genes are tissue-specifically expressed. Most of them are found in testis, midgut, and Malpighian tubule, which have 1104, 216, and 110 tissue-specific genes, respectively. Based on the gene annotation, these tissue-specific genes are related to the function of the corresponding tissue. Data analysis also suggested that at least 209 genes are co-expressed in only two tissues. These genes reflect similar physiological functions or cellular components between tissues.
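     The expression categories above (detected in at least one tissue, expressed in every tissue, tissue-specific, shared by only two tissues) can be sketched in Python as follows. The abstract does not state the detection criterion, so the signal threshold, the rule that "tissue-specific" means detected in exactly one tissue, the function name, and the toy values are all assumptions for illustration.

```python
def expression_classes(signals, threshold):
    """Classify genes by the set of tissues in which the signal reaches `threshold`.

    signals: {gene_id: {tissue: signal}}; the threshold is a hypothetical cutoff.
    Returns {gene_id: (label, sorted list of tissues where the gene is detected)}.
    """
    result = {}
    for gene, by_tissue in signals.items():
        detected = sorted(t for t, s in by_tissue.items() if s >= threshold)
        if not detected:
            label = "not_detected"
        elif len(detected) == len(by_tissue):
            label = "all_tissues"        # e.g. housekeeping candidates
        elif len(detected) == 1:
            label = "tissue_specific"
        elif len(detected) == 2:
            label = "two_tissues"
        else:
            label = "other"
        result[gene] = (label, detected)
    return result


# Hypothetical toy signals for three of the ten tissues.
toy = {
    "gA": {"testis": 900, "midgut": 20, "head": 15},
    "gB": {"testis": 500, "midgut": 450, "head": 700},
    "gC": {"testis": 10, "midgut": 5, "head": 8},
    "gD": {"testis": 300, "midgut": 250, "head": 30},
}
labels = expression_classes(toy, threshold=100)
```

     A fixed threshold is the simplest possible choice; a real analysis would more likely combine signal intensity with detection calls or replicate-based statistics.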
     We validated the reliability of the microarray data with several approaches, including bioinformatic analysis and experiments. The results suggest that the microarray data are reliable and that the analysis methods used in this study are appropriate. Finally, we constructed the silkworm microarray database BmMDB from the microarray data so that the expression profiles of silkworm genes can be conveniently accessed. This gene expression information will facilitate further study of gene function.
     3. Construction of the silkworm genome database
     With the completion of the updated assembly of the silkworm genome sequence, the quality of the genome sequence has greatly improved and genes are predicted more accurately. In addition, 87.4% of the genomic sequence could be anchored to chromosomes. To provide access to these data and integrate more information, we reconstructed the silkworm genome database. The new database, SilkDB, can be accessed at http://silkworm.swu.edu.cn/silkdb or http://silkworm.genomics.org.cn.
     In the new database, navigation is provided by the GBrowse genome browser instead of MapView, which was used in the previous database. With GBrowse, users can access any region of the genome. The database also provides a variety of search methods. One is searching by keyword, gene ID, and so on. Another is homology searching with BLAST against EST, genome, and gene sequences. In addition, we developed the Silkworm Chromosomes Browser (SCB) and SilkMap to make the silkworm data resources easier to use.
     The gene page is the heart of the silkworm database. It displays detailed gene information, such as domain information, GO classification, annotation from homology searches, gene family, gene expression information, references, and gene sequence. This information is the basis for further research on gene function.
     New information can easily be added to the current database as it becomes available. As a next step, we will correct erroneous data in the database and add more experimental data, such as expression information obtained from SAGE and phenotypes resulting from RNA interference (RNAi) or gene mutation. In short, the silkworm genome database will play an important role in accelerating functional studies of silkworm genes.
     4. Identification of C2H2 ZFPs in the silkworm
     The C2H2 zinc-finger domain binds DNA in a sequence-specific manner. Proteins that contain this domain are called C2H2 zinc-finger proteins (ZFPs). Most C2H2 ZFPs function as sequence-specific DNA-binding transcription factors and play important roles in development, cell differentiation, metamorphosis, and other processes. By searching the silkworm genome with an HMM model of C2H2 zinc fingers (PF00096), we systematically identified 338 C2H2 ZFP genes in the silkworm genome, which constitute 2.3% of the annotated genes. Compared with Drosophila, the silkworm has significantly more C2H2 ZFP genes and C2H2 zinc fingers. Further study showed that the additional silkworm genes are mainly genes that contain more than 10 zinc fingers.
     C2H2 ZFPs often have domains other than the C2H2 zinc finger. These are called zinc-finger-associated domains and may assist ZFPs in activating or repressing the expression of target genes. In Bombyx mori, 90 genes have zinc-finger-associated domains. Of these domains, ZAD is the most prevalent, with 50 ZADs in 50 genes. Comparative analysis showed that there is no ZAD in Caenorhabditis elegans and only one in human, whereas Drosophila has 87. This result indicates that ZAD is a domain that has undergone lineage-specific expansion in insects. We speculate that ZAD-containing genes may be related to physiological or metabolic processes particular to insects.
     The distribution of C2H2 ZFP genes in the genome was investigated. A total of 324 C2H2 ZFP genes could be located on chromosomes. Of these, 241 genes are concentrated in 59 tandem duplication clusters (with a threshold of 500 kb between neighboring genes). The largest cluster is located on chromosome 24 and consists of 43 C2H2 ZFP genes within a 650 kb fragment. Most of the ZFP genes are tandemly clustered on chromosomes, indicating that tandem duplication plays an important role in expanding the number of these genes. The cluster organization also results in an uneven distribution of these genes across chromosomes. C2H2 ZFP genes are concentrated on chromosomes 11, 15, and 24, which together account for 38.8% of all C2H2 ZFP genes.
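     The 500 kb neighbor criterion for tandem duplication clusters can be sketched as a single pass over genes sorted by chromosomal position. This is an illustrative Python sketch, not the procedure used in this study: the function name, the choice of one coordinate per gene (for example, its start position), and the toy gene IDs and positions are assumptions. Only the 500 kb threshold comes from the text.

```python
from collections import defaultdict


def tandem_clusters(genes, max_gap=500_000):
    """Group genes into clusters in which neighbors lie less than `max_gap` bp apart.

    genes: iterable of (gene_id, chromosome, position); one position per gene
    is an assumption. Returns a list of (chromosome, [gene_ids]) for clusters
    of two or more genes, ordered by chromosome and then by position.
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, pos in genes:
        by_chrom[chrom].append((pos, gene_id))

    clusters = []
    for chrom in sorted(by_chrom):
        members = sorted(by_chrom[chrom])
        current = [members[0]]
        for prev, nxt in zip(members, members[1:]):
            if nxt[0] - prev[0] < max_gap:
                current.append(nxt)          # still within the same cluster
            else:
                if len(current) > 1:
                    clusters.append((chrom, [g for _, g in current]))
                current = [nxt]              # start a new candidate cluster
        if len(current) > 1:
            clusters.append((chrom, [g for _, g in current]))
    return clusters


# Hypothetical toy positions (bp); gene IDs are made up.
toy_genes = [
    ("a", 24, 100_000), ("b", 24, 400_000), ("c", 24, 1_200_000),
    ("d", 11, 50_000), ("e", 11, 300_000), ("f", 11, 700_000),
    ("g", 15, 10),
]
found = tandem_clusters(toy_genes)
```

     Because the criterion is applied between neighbors, a cluster as a whole can span more than 500 kb, which is consistent with the largest reported cluster covering 650 kb.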
     Gene family information helps in understanding gene function. In a comparison with the C2H2 ZFPs of H. sapiens, C. elegans, and D. melanogaster, silkworm C2H2 ZFPs were classified into 75 gene families, 63 of which are evolutionarily conserved, i.e., they have members from D. melanogaster, C. elegans, or H. sapiens. Among the evolutionarily conserved families, 32 have members only from silkworm and D. melanogaster, indicating that these genes are insect-specific. In addition, 12 families appear only in the silkworm. Including the singleton genes, there are 188 silkworm species-specific C2H2 ZFP genes, whereas C. elegans, D. melanogaster, and H. sapiens have only 120, 125, and 160 species-specific genes, respectively. This suggests that the silkworm has more species-specific C2H2 ZFP genes than the other organisms. The silkworm is characterized by silk production and metamorphosis, so further study of these genes may uncover the relationship between the silkworm species-specific genes and these species-specific biological processes.
     Day 3 of the fifth instar is an important stage for silk protein synthesis and preparation for metamorphosis. In this study, we examined the expression patterns of the silkworm C2H2 ZFP genes in different tissues based on microarray data. A total of 132 C2H2 ZFP genes were expressed in at least one of the investigated tissues. Of these, 33 genes are expressed in every tissue, and 14 genes are expressed exclusively in one investigated tissue. The results indicate that these genes may play important roles at this stage. For example, among the genes expressed in all tissues, BmZFP286 belongs to the DNJA5 family, so it may be related to protein folding at this stage. BmZFP104 belongs to the Ab family, and we speculate that it may coordinate the movement of tissues or organs at this stage. BmZFP160 shares high similarity with Drosophila crol. We speculate that this gene may be an early response gene for ecdysone and that it has likely been induced by ecdysone.
     In this study, we identified C2H2 ZFP genes in the silkworm and acquired basic information about them, such as chromosomal distribution, gene family information, and expression information. This information will be useful for further functional studies of these genes.
引文
[1] Watson JD: The human genome project: past, present, and future. Science 1990, 248(4951):44-49.
    [2] Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR et al: Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 1995, 269(5223):496-512.
    [3] Goffeau A, Barrell BG, Bussey H, Davis RW, Dujon B, Feldmann H et al: Life with 6000 genes. Science 1996,274(5287):546, 563-547.
    [4] Blattner FR, Plunkett G, 3rd, Bloch CA, Perna NT, Burland V, Riley M et al: The complete genome sequence of Escherichia coli K-12. Science 1997,277(5331):1453-1474.
    [5] Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 1998, 282(5396):2012-2018.
    [6] Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG et al: The genome sequence of Drosophila melanogaster. Science 2000,287(5461):2185-2195.
    [7] Goldsmith MR, Shimada T, Abe H: The genetics and genomics of the silkworm, Bombyx mori. Annu Rev Entomol 2005, 50:71-100.
    [8] Dulbecco R: A turning point in cancer research: sequencing the human genome. Science 1986, 231(4742): 1055-1056.
    [9] Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG et al: The sequence of the human genome. Science 2001, 291(5507):1304-1351.
    [10] Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P et al: Initial sequencing and comparative analysis of the mouse genome. Nature 2002,420(6915):520-562.
    
    [11] Finishing the euchromatic sequence of the human genome. Nature 2004,431 (7011):931 -945.
    [12] Gregory SG, Barlow KF, McLay KE, Kaul R, Swarbreck D, Dunham A et al: The DNA sequence and biological annotation of human chromosome 1. Nature 2006,441(7091):315-321.
    [13] Grimmelikhuijzen CJ, Cazzamali G, Williamson M, Hauser F: The promise of insect genomics. Pest Manag Sci 2007, 63(5):413-416.
    [14] Richards S, Liu Y, Bettencourt BR, Hradecky P, Letovsky S, Nielsen R et al: Comparative genome sequencing of Drosophila pseudoobscura: chromosomal, gene, and cis-element evolution. Genome Res 2005, 15(1):1-18.
    [15] Clark AG, Eisen MB, Smith DR, Bergman CM, Oliver B, Markow TA et al: Evolution of genes and genomes on the Drosophila phylogeny. Nature 2007,450(7167):203-218.
    [16] Holt RA, Subramanian GM, Halpem A, Sutton GG, Charlab R, Nusskern DR et al: The genome sequence of the malaria mosquito Anopheles gambiae.Science 2002,298(5591):129-149.
    [17]Nene V,Wortman JR,Lawson D,Haas B,Kodira C,Tu ZJ et al:Genome sequence of Aedes aegypti,a major arbovirus vector.Science 2007,316(5832):1718-1723.
    [18]Xia Q,Zhou Z,Lu C,Cheng D,Dai F,Li B et al:A draft sequence for the genome of the domesticated silkworm(Bombyx mori).Science 2004,306(5703):1937-1940.
    [19]Insights into social insects from the genome of the honeybee Apis mellifera.Nature 2006,443(7114):931-949.
    [20]Richards S,Gibbs RA,Weinstock GM,Brown S J,Denell R,Beeman RW et al:The genome of the model beetle and pest Tribolium castaneum.Nature 2008,452(7190):949-955.
    [21]Phan JH,Quo CF,Wang MD:Functional genomics and proteomics in the clinical neurosciences:data mining and bioinformatics.Prog Brain Res 2006,158:83-108.
    [22]Lamitina T:Functional genomic approaches in C.elegans.Methods Mol Biol 2006,351:127-138.
    [23]樊龙江:生物信息学札记.生物信息学札记(第2版)
    [24]Apweiler R,Attwood TK,Bairoch A,Bateman A,Birney E,Biswas M et al:The lnterPro database,an integrated documentation resource for protein families,domains and functional sites.Nucleic Acids Res 2001,29(1):37-40.
    [25]Berglund AC,Sjolund E,Ostlund G,Sonnhammer EL:InParanoid 6:eukaryotic ortholog clusters with inparalogs.Nucleic Acids Res 2008,36(Database issue):D263-266.
    [26]Tatusov RL,Galperin MY,Natale DA,Koonin EV:The COG database:a tool for genome-scale analysis of protein functions and evolution.Nucleic Acids Res 2000,28(1):33-36.
    [27]Adams MD,Kelley JM,Gocayne JD,Dubnick M,Polymeropoulos MH,Xiao H et al:Complementary DNA sequencing:expressed sequence tags and human genome project.Science 1991,252(5013):1651-1656.
    [28]Velculescu VE,Zhang L,Vogelstein B,Kinzler KW:Serial analysis of gene expression.Science 1995,270(5235):484-487.
    [29]Barinaga M:Will "DNA chip" speed genome initiative? Science 1991,253(5027):1489.
    [30]Wang ZL,Li J,Xia QY,Zhao P,Duan J,Zha XF et al:Identification and expression pattern of Bmlark,a homolog of the Drosophila gene lark in Bombyx moil.DNA Seq 2005,16(3):224-229.
    [31]Tuteja R,Tuteja N:Serial Analysis of Gene Expression:Applications in Malaria Parasite,Yeast,Plant,and Animal Studies.J Biomed Biotechnol 2004,2004(2):106-112.
    [32]Tuteja R,Tuteja N:Serial analysis of gene expression(SAGE):unraveling the bioinformatics tools.Bioessays 2004,26(8):916-922.
    [33]van Ruissen F,Baas F:Serial analysis of gene expression(SAGE).Methods Mol Biol 2007, 383:41-66.
    [34]Son CG,Bilke S,Davis S,Greer BT,Wei JS,Whiteford CC et al:Database of mRNA gene expression profiles of multiple human organs.Genome Res 2005,15(3):443-450.
    [35]Stoic V,Gauhar Z,Mason C,Halasz G,van Batenburg MF,Rifkin SA et al:A gene expression map for the euchromatic genome of Drosophila melanogaster.Science 2004,306(5696):655-660.
    [36]Koutsos AC,Blass C,Meister S,Schmidt S,MacCallum RM,Soares MB et al:Life cycle transcriptome of the malaria mosquito Anopheles gambiae and comparison with the fruitfly Drosophila melanogaster.Proc Natl Acad Sci U S A 2007,104(27):11304-11309.
    [37]Knight RD,Shimeld SM:Identification of conserved C2H2 zinc-finger gene families in the Bilateria.Genome Biol2001,2(5):RESEARCH0016.
    [38]Chung HR,Schafer U,Jackle H,Bohm S:Genomic expansion and clustering of ZAD-containing C2H2 zinc-finger genes in Drosophila.EMBO Rep 2002,3(12):1158-1162.
    [39]Clarke ND,Berg JM:Zinc fingers in Caenorhabditis elegans:finding families and probing pathways.Science 1998,282(5396):2018-2022.
    [40]Miller J,McLachlan AD,Klug A:Repetitive zinc-binding domains in the protein transcription factor ⅢA from Ⅹenopus oocytes.Embo J 1985,4(6):1609-1614.
    [41]Wolfe SA,Nekludova L,Pabo CO:DNA recognition by Cys2His2 zinc finger proteins.Annu Rev Biophys Biomol Struct 2000,29:183-212.
    [42]Lee MS,Gippert GP,Soman KV,Case DA,Wright PE:Three-dimensional solution structure of a single zinc finger DNA-binding domain.Science 1989,245(4918):635-637.
    [43]Miura T,Satoh T,Takeuchi H:Role of metal-ligand coordination in the folding pathway of zinc finger peptides.Biochim Biophys Acta 1998,1384(1):171-179.
    [44]Posewitz MC,Wilcox DE:Properties of the Spl zinc finger 3 peptide:coordination chemistry,redox reactions,and metal binding competition with metallothionein.Chem Res Toxicol 1995,8(8):1020-1028.
    [45]Papworth M,Kolasinska P,Minczuk M:Designer zinc-finger proteins and their applications.Gene 2006,366(1):27-38.
    [46]Huntley S,Baggott DM,Hamilton AT,Tran-Gyamfi M,Yang S,Kim J et al:A comprehensive catalog of human KRAB-associated zinc finger genes:insights into the evolutionary history of a large family of transcriptional repressors.Genome Res 2006,16(5):669-677.
    [47]Bohm S,Frishman D,Mewes HW:Variations of the C2H2 zinc finger motif in the yeast genome and classification of yeast zinc finger proteins.Nucleic Acids Res 1997,25(12):2464-2469.
    [48]Bouhouche N,Syvanen M,Kado CI:The origin of prokaryotic C2H2 zinc finger regulators.Trends Microbiol 2000,8(2):77-81.
    [49]Klingler M:The organization of the antero-posterior axis.Semin Cell Biol 1990,1(3):151-160.
    [50]Papatsenko D,Levine MS:Dual regulation by the Hunchback gradient in the Drosophila embryo.Proc Natl Acad Sci U S A 2008,105(8):2901-2906.
    [51]Desjarlais JR,Berg JM:Toward rules relating zinc finger protein sequences and DNA binding site preferences.Proc Natl Acad Sci USA 1992,89(16):7345-7349.
    [52]Urnov FD,Miller JC,Lee YL,Beausejour CM,Rock JM,Augustus S et al:Highly efficient endogenous human gene correction using designed zinc-finger nucleases.Nature 2005,435(7042):646-651.
    [53]Rebar EJ,Huang Y,Hickey R,Nath AK,Meoli D,Nath S et al:Induction of angiogenesis in a mouse model using engineered transcription factors.Nat Med 2002,8(12):1427-1432.
    [54]Mita K,Kasahara M,Sasaki S,Nagayasu Y,Yamada T,Kanamori H et al:The genome sequence of silkworm,Bombyx moil.DNA Res 2004,11(1):27-35.
    [55]Cheng T,Zhao P,Liu C,Xu P,Gao Z,Xia Q et al:Structures,regulatory regions,and inductive expression patterns of antimicrobial peptide genes in the silkworm Bombyx moil.Genomics 2006,87(3):356-365.
    [56]Fujii T,Shimada T:Sex determination in the silkworm,Bombyx moil:a female determinant on the W chromosome and the sex-determining gene cascade.Semin Cell Dev Biol 2007,18(3):379-388.
    [57]The International Silkworm Genome Sequencing Consortium,2008.Silkworm genome sequence reveals biology underlying silk production,phytophagy,and metamorphosis.submitted.
    [58]邱咏梅,夏庆友,程道军,沈以红,刘春,林英,查幸福,向仲怀:家蚕母性基因的表达序列标签分析.昆虫学报2001,47(2):159-165.
    [59]程道军,夏庆友,周泽扬,鲁成,向仲怀:家蚕cDNA文库构建及大规模EST测序.蚕业科学2003,29(4):335-339
    [60]Zhong B,Yu Y,Xu Y,Yu H,Lu X,Miao Y et al:Analysis of ESTs and gene expression patterns of the posterior silkgland in the fifth instar larvae of silkworm,Bombyx mori L.Sci China C Life Sci 2005,48(1):25-33.
    [61]Mita K,Morimyo M,Okano K,Koike Y,Nohata J,Kawasaki H et al:The construction of an EST database for Bombyx mori and its application.Proc Natl Acad Sci U S A 2003,100(24):14121-14126.
    [62]Huang J,Miao X,Jin W,Couble P,Mita K,Zhang Y et al:Serial analysis of gene expression in the silkworm,Bombyx mori.Genomics 2005,86(2):233-241.
    [63]Huang J,Miao X,Jin W,Couble P,Zhang Y,Liu W et al:Radiation-induced changes in gene expression in the silkworm revealed by serial analysis of gene expression(SAGE).Insect Mol Biol 2005,14(6):665-674.
    [64]Noji T,Ote M,Takeda M,Mita K,Shimada T,Kawasaki H:Isolation and comparison of different ecdysone-responsive cuticle protein genes in wing discs of Bombyx mori.Insect Biochem Mol Biol 2003,33(7):671-679.
    [65]Ote M,Mita K,Kawasaki H,Seki M,Nohata J,Kobayashi M et al:Microarray analysis of gene expression profiles in wing discs of Bombyx mori during pupal ecdysis.Insect Biochem Moi Biol 2004,34(8):775-784.
    [66]Hong SM,Nho SK,Kim NS,Lee JS,Kang SW:Gene expression profiling in the silkworm,Bombyx mori,during early embryonic development.Zoolog Sci 2006,23(6):517-528.
    [67]Wang J,Xia Q,He X,Dai M,Ruan J,Chen J et al:SilkDB:a knowledgebase for silkworm biology and genomics.Nucleic Acids Res 2005,33(Database issue):D399-402.
    [68]Prasad MD,Muthulakshmi M,Arunkumar KP,Madhu M,Sreenu VB,Pavithra V et al:SilkSatDb:a microsatellite database of the silkworm,Bombyx moil.Nucleic Acids Res 2005,33(Database issue):D403-406.
    [69]Hideyuki Kajiwara KN,Jiang Piyang,Atsue Imamaki,Yoko Ito,Fumio Togasaki,Tsuyoshi Kotake,Hikari Murai:Draft of silkworm proteome database Proc Natl Acad Sci U S A 2008,105(8):2901-2906.
    [70]Elsik CG,Mackey AJ,Reese JT,Milshina NV,Roos DS,Weinstock GM:Creating a honey bee consensus gene set.Genome Biol 2007,8(1):R13.
    [71]Altschul SF,Madden TL,Schaffer AA,Zhang J,Zhang Z,Miller W et al:Gapped BLAST and PSI-BLAST:a new generation of protein database search programs.Nucleic Acids Res 1997,25(17):3389-3402.
    [72]Benson D,Lipman DJ,Ostell J:GenBank.Nucleic Acids Res 1993,21(13):2963-2965.
    [73]Ye J,Fang L,Zheng H,Zhang Y,Chen J,Zhang Z et al:WEGO:a web tool for plotting GO annotations.Nucleic Acids Res 2006,34(Web Server issue):W293-297.
    [74]Jekely G,Friedrich P:Characterization of two recombinant Drosophila calpains.CALPA and a novel homolog,CALPB.J Biol Chem 1999,274(34):23893-23900.
    [75]Arent S,Pye VE,Henriksen A:Structure and function of plant acyl-CoA oxidases.Plant Physiol Biochern 2008,46(3):292-301.
    [76]Noseda M,Karsan A:Notch and minichromosome maintenance(MCM)proteins:integration of two ancestral pathways in cell cycle control.Cell Cycle 2006,5(23):2704-2709.
    [77] Kaneko M, Nighorn A: Interaxonal Eph-ephrin signaling may mediate sorting of olfactory sensory axons in Manduca sexta. J Neurosci 2003, 23(37):11523-11538.
    [78] Shin SW, Park SS, Park DS, Kim MG, Kim SC, Brey PT et al: Isolation and characterization of immune-related genes from the fall webworm, Hyphantria cunea, using PCR-based differential display and subtractive cloning. Insect Biochem MolBiol 1998,28(11):827-837.
    [79] Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz HR et al: The Pfam protein families database. Nucleic Acids Res 2008, 36(Database issue):D281-288.
    [80] Camon E, Magrane M, Barrell D, Lee V, Dimmer E, Maslen J et al: The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology. Nucleic Acids Res 2004, 32(Database issue):D262-266.
    [81] Li H, Coghlan A, Ruan J, Coin LJ, Heriche JK, Osmotherly L et al: TreeFam: a curated database of phylogenetic trees of animal gene families. Nucleic Acids Res 2006, 34(Database issue):D572-580.
    [82] Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A 1998, 95(25):14863-14868.
    [83] Michaille JJ, Couble P, Prudhomme JC, Garel A: A single gene produces multiple sericin messenger RNAs in the silk gland of Bombyx mori. Biochimie 1986, 68(10-11):1165-1173.
    [84] Byeon GM, Lee KS, Gui ZZ, Kim I, Kang PD, Lee SM et al: A digestive beta-glucosidase from the silkworm, Bombyx mori: cDNA cloning, expression and enzymatic characterization. Comp Biochem Physiol B Biochem Mol Biol 2005, 141(4):418-427.
    [85] Miyagawa Y, Lee JM, Maeda T, Koga K, Kawaguchi Y, Kusakabe T: Differential expression of a Bombyx mori AHA1 homologue during spermatogenesis. Insect Mol Biol 2005, 14(3):245-253.
    [86] Ota A, Kusakabe T, Sugimoto Y, Takahashi M, Nakajima Y, Kawaguchi Y et al: Cloning and characterization of testis-specific tektin in Bombyx mori. Comp Biochem Physiol B Biochem Mol Biol 2002, 133(3):371-382.
    [87] Stein LD, Mungall C, Shu S, Caudy M, Mangone M, Day A et al: The generic genome browser: a building block for a model organism system database. Genome Res 2002, 12(10):1599-1610.
    [88] Letondal C: A Web interface generator for molecular biology programs in Unix. Bioinformatics 2001, 17(1):73-82.
    [89] Xia Q, Cheng D, Duan J, Wang G, Cheng T, Zha X et al: Microarray-based gene expression profiles in multiple tissues of the domesticated silkworm, Bombyx mori. Genome Biol 2007, 8(8):R162.
    [90] Yamamoto K, Nohata J, Kadono-Okuda K, Narukawa J, Sasanuma M, Sasanuma SI et al: A BAC-based integrated linkage map of the silkworm Bombyx mori. Genome Biol 2008, 9(1):R21.
    [91] Nishita Y, Takiya S: Structure and expression of the gene encoding a Broad-Complex homolog in the silkworm, Bombyx mori. Gene 2004, 339:161-172.
    [92] Xu X, Xu PX, Amanai K, Suzuki Y: Double-segment defining role of even-skipped homologs along the evolution of insect pattern formation. Dev Growth Differ 1997, 39(4):515-522.
    [93] Eddy SR: Profile hidden Markov models. Bioinformatics 1998, 14(9):755-763.
    [94] Enright AJ, Van Dongen S, Ouzounis CA: An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 2002, 30(7):1575-1584.
    [95] Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22(22):4673-4680.
    [96] Tamura K, Dudley J, Nei M, Kumar S: MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol 2007, 24(8):1596-1599.
    [97] Jauch R, Bourenkov GP, Chung HR, Urlaub H, Reidt U, Jackle H et al: The zinc finger-associated domain of the Drosophila transcription factor grauzone is a novel zinc-coordinating protein-protein interaction module. Structure 2003, 11(11):1393-1402.
    [98] Rideout EJ, Billeter JC, Goodwin SF: The sex-determination genes fruitless and doublesex specify a neural substrate required for courtship song. Curr Biol 2007, 17(17):1473-1478.
    [99] Suske G, Bruford E, Philipsen S: Mammalian SP/KLF transcription factors: bring in the family. Genomics 2005, 85(5):551-556.
    [100] De Graeve F, Smaldone S, Laub F, Mlodzik M, Bhat M, Ramirez F: Identification of the Drosophila progenitor of mammalian Kruppel-like factors 6 and 7 and a determinant of fly development. Gene 2003, 314:55-62.
    [101] Munoz-Descalzo S, Terol J, Paricio N: Cabut, a C2H2 zinc finger transcription factor, is required during Drosophila dorsal closure downstream of JNK signaling. Dev Biol 2005, 287(1):168-179.
    [102] Herke SW, Serio NV, Rogers BT: Functional analyses of tiptop and antennapedia in the embryonic development of Oncopeltus fasciatus suggests an evolutionary pathway from ground state to insect legs. Development 2005, 132(1):27-34.
    [103] Hu S, Fambrough D, Atashi JR, Goodman CS, Crews ST: The Drosophila abrupt gene encodes a BTB-zinc finger regulatory protein that controls the specificity of neuromuscular connections. Genes Dev 1995, 9(23):2936-2948.
    [104] D'Avino PP, Thummel CS: crooked legs encodes a family of zinc finger proteins required for leg morphogenesis and ecdysone-regulated gene expression during Drosophila metamorphosis. Development 1998, 125(9):1733-1745.
    [105] Demir E, Dickson BJ: fruitless splicing specifies male courtship behavior in Drosophila. Cell 2005, 121(5):785-794.
    [106] Ohbayashi F, Suzuki MG, Shimada T: Sex determination in Bombyx mori. Curr Sci 2002, 9(83):446-471.
    [107] Berger J, Bird A: Role of MBD2 in gene regulation and tumorigenesis. Biochem Soc Trans 2005, 33(Pt 6):1537-1540.
    [108] Brayer KJ, Kulshreshtha S, Segal DJ: The Protein-Binding Potential of C2H2 Zinc Finger Domains. Cell Biochem Biophys 2008.
    [109] Ekman D, Light S, Bjorklund AK, Elofsson A: What properties characterize the hub proteins of the protein-protein interaction network of Saccharomyces cerevisiae? Genome Biol 2006, 7(6):R45.
    [110] Dehal P, Predki P, Olsen AS, Kobayashi A, Folta P, Lucas S et al: Human chromosome 19 and related regions in mouse: conservative and lineage-specific evolution. Science 2001, 293(5527):104-111.
