Semantic text mining support for lignocellulose research
  • Authors: Marie-Jean Meurs (1) (2)
    Caitlin Murphy (2) (3)
    Ingo Morgenstern (2) (3)
    Greg Butler (1) (2)
    Justin Powlowski (2) (4)
    Adrian Tsang (2) (3)
    René Witte (1)
  • Journal: BMC Medical Informatics and Decision Making
  • Publication year: 2012
  • Publication date: December 2012
  • Volume: 12
  • Issue: Suppl 1
  • Full text size: 1270 KB
  • Author affiliations:
    1. Department of Computer Science and Software Engineering, Concordia University, Montréal, QC, Canada
    2. Centre for Structural and Functional Genomics, Concordia University, Montréal, QC, Canada
    3. Department of Biology, Concordia University, Montréal, QC, Canada
    4. Department of Chemistry and Biochemistry, Concordia University, Montréal, QC, Canada
Abstract
Background: Biofuels produced from biomass are considered to be promising sustainable alternatives to fossil fuels. The conversion of lignocellulose into fermentable sugars for biofuel production requires the use of enzyme cocktails that can efficiently and economically hydrolyze lignocellulosic biomass. As many fungi naturally break down lignocellulose, the identification and characterization of the enzymes involved is a key challenge in the research and development of biomass-derived products and fuels. One approach to meeting this challenge is to mine the rapidly expanding repertoire of microbial genomes for enzymes with the appropriate catalytic properties.

Results: Semantic technologies, including natural language processing, ontologies, semantic Web services and Web-based collaboration tools, promise to support users in handling complex data, thereby facilitating knowledge-intensive tasks. An ongoing challenge is to select the appropriate technologies and combine them in a coherent system that brings measurable improvements to the users. We present our ongoing development of a semantic infrastructure in support of genomics-based lignocellulose research. Part of this effort is the automated curation of knowledge from information on fungal enzymes that is available in the literature and genome resources.

Conclusions: Working closely with fungal biology researchers who manually curate the existing literature, we developed ontological natural language processing pipelines integrated in a Web-based interface to assist them in two main tasks: mining the literature for relevant knowledge, and at the same time providing rich and semantically linked information.
