Classification methods for finding articles describing protein-protein interactions in PubMed
Matos, Sérgio ; Oliveira, José Luís
Journal of Integrative Bioinformatics - JIB (ISSN 1613-4516)
With the rapid growth in the number of papers published in the biomedical field, finding relevant articles has become a demanding task for researchers. This has led to increasing interest in text mining tools that help search the literature and identify the most relevant documents or information. One specific task of interest is identifying articles from which protein-protein interactions might be extracted. Using the BioCreative III Article Classification Task dataset, which consists of PubMed abstracts labelled as relevant or non-relevant for describing protein-protein interactions, we compare different classification methods with different sets of features. The best result, an area under the interpolated precision-recall curve of 0.654, indicates that the proposed classification strategy could be incorporated into database curation workflows to prioritize articles for the extraction of protein-protein interactions. We also analysed the use of this method for ranking documents returned by general PubMed queries, and propose that this approach could be useful for researchers looking for publications that describe protein-protein interactions within a particular topic of interest.
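For readers unfamiliar with the evaluation measure mentioned in the abstract, the following is a minimal sketch of how an area under the interpolated precision-recall curve can be computed from a ranked list of documents. It is not the authors' evaluation code. The function name `interpolated_pr_auc`, the 0/1 label encoding, and the step-wise summation of the area are all assumptions made for illustration, and the official BioCreative evaluation may differ in detail.

```python
def interpolated_pr_auc(ranked_labels):
    """Area under the interpolated precision-recall curve.

    ranked_labels: list of 0/1 relevance labels, ordered from the
    highest-scored document to the lowest (hypothetical input format).
    """
    total_relevant = sum(ranked_labels)
    if total_relevant == 0:
        raise ValueError("at least one relevant document is required")

    # Precision and recall at each rank where a relevant document appears.
    recalls, precisions = [], []
    hits = 0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            recalls.append(hits / total_relevant)
            precisions.append(hits / rank)

    # Interpolation: precision at recall r becomes the highest precision
    # observed at any recall level greater than or equal to r.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Step-wise area: interpolated precision times the recall increment
    # (an assumed integration scheme).
    area = 0.0
    previous_recall = 0.0
    for recall, precision in zip(recalls, precisions):
        area += precision * (recall - previous_recall)
        previous_recall = recall
    return area


if __name__ == "__main__":
    # Toy ranking: 1 = relevant to protein-protein interactions, 0 = not.
    print(round(interpolated_pr_auc([1, 0, 1, 1, 0]), 4))
```

A ranking that places every relevant abstract ahead of every non-relevant one scores 1.0 under this sketch, so the reported 0.654 can be read against that upper bound.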
Faculty of Technology, Research Groups in Informatics
Data processing, computer science, computer systems