Direct Use of Information Extraction from Scientific Text for Modeling and Simulation in the Life Sciences


  • Erfan Younesi
  • Vinod Kasam
  • Martin Hofmann-Apitius



Text mining, information extraction, in silico experiment, drug discovery, grid computing, DDC: 020 (Library and information sciences)


Purpose: To demonstrate how information extracted from scientific text can be used directly in support of life science research projects. In modern digital research and academic libraries, librarians should be able to support data discovery and the organization of digital entities in order to foster research projects effectively; we therefore speculate that text mining and knowledge discovery tools could be of great assistance to librarians. Such tools enable librarians to cope with the growing volume and complexity of scientific literature, especially in emerging interdisciplinary fields of science. In this paper we present an example of how evidence extracted from scientific literature can be directly integrated into in silico disease models in support of drug discovery projects.
Design/methodology/approach: The application of text mining and knowledge discovery tools is explained in the form of a knowledge-based workflow for drug target candidate identification. Moreover, we propose an in silico experimentation framework to improve efficiency and productivity in the early steps of the drug discovery workflow.
Findings: Our in silico experimentation workflow has been successfully applied to searching for hit and lead compounds in the World-wide In Silico Docking On Malaria (WISDOM) project and to finding novel inhibitor candidates.
Practical implications: Direct extraction of biological information from text will ease the task of librarians in managing digital objects and supporting research projects. We expect that textual data will play an increasingly important role in the evidence-based approaches taken by biomedical and translational researchers.
Originality/value: Our proposed approach provides a practical example of the direct integration of text- and knowledge-based data into life science research projects, with emphasis on its application by academic and research libraries in support of scientific projects.