Mapping protein information to disease terminologies

Mottaz, Anaïs ; Yip, Yum L. ; Ruch, Patrick ; Veuthey, Anne-Lise

Journal of Integrative Bioinformatics - JIB (ISSN 1613-4516)

To make genomic and proteomic information more accessible to medical researchers, we have developed a procedure that links biological information on proteins involved in diseases to the MeSH and ICD-10 disease terminologies. For this purpose, we took advantage of the manually curated disease annotations in more than 2,000 human protein entries of the UniProt KnowledgeBase. We mapped disease names extracted from the entry comment lines or from the corresponding OMIM entry to MeSH. The method was assessed on a benchmark set of 200 manually mapped disease comment lines, on which it reached a recall of 54% at a precision of 91%. The same procedure was then used to map the more than 3,000 diseases in Swiss-Prot to MeSH with comparable efficiency. When tested on ICD-10, the coverage of the mapped terms was lower, which could be explained by the coarse-grained structure of this terminology for describing hereditary diseases. The mapping is provided as supplementary material at http://research.isb-sib.ch/unimed.

Mottaz, Anaïs ; Yip, Yum L. ; Ruch, Patrick ; Veuthey, Anne-Lise (2007) Mapping protein information to disease terminologies. Journal of Integrative Bioinformatics - JIB (ISSN 1613-4516), 4(3), 2007. Special Issue: 4th Integrative Bioinformatics Workshop, Gent, Belgium

Online-Journal: http://journal.imbio.de/index.php?paper_id=79
URL: http://biecoll.ub.uni-bielefeld.de/volltexte/2007/260

