Universität Bielefeld Electronic Collections




Data Cleaning and Semantic Improvement in Biological Databases

Apiletti, Daniele ; Bruno, Giulia ; Ficarra, Elisa ; Baralis, Elena

Journal of Integrative Bioinformatics - JIB (ISSN 1613-4516)



Abstract:
Public genomic and proteomic databases can be affected by a variety of errors. These errors may involve either the description or the meaning of the data (that is, syntactic or semantic errors). We focus on detecting semantic errors in order to verify the accuracy of the stored information. In particular, we address data constraints and functional dependencies among attributes in a given relational database. Constraints and dependencies express semantic relationships among the attributes of a database schema; knowing them can help improve data quality and integration in database design, and support query optimization and dimensionality reduction. We propose a method that discovers data constraints and functional dependencies by means of association rule mining. Association rules are extracted among attribute values and allow us to find causal relationships among them. Then, by analyzing the support and confidence of each rule, (probabilistic) data constraints and functional dependencies can be detected. Our method can both reveal erroneous data and highlight novel semantic information. Moreover, it is database-independent because it infers rules from the data. In this paper, we report the application of our techniques to the SCOP (Structural Classification of Proteins) and CATH Protein Structure Classification databases.
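The abstract does not spell out the mining procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes rules of the form "attribute X has value a, so attribute Y has value b" between two columns, computes their support and confidence, and uses them to estimate how closely X determines Y and which rows contradict a strong rule. The function names, the column names `fold` and `class`, the threshold `min_conf`, and the toy rows are all invented for illustration.

```python
from collections import Counter


def rule_stats(rows, lhs, rhs):
    """Map each rule (lhs=a -> rhs=b) to its (support, confidence)."""
    n = len(rows)
    pair_counts = Counter((row[lhs], row[rhs]) for row in rows)
    lhs_counts = Counter(row[lhs] for row in rows)
    return {
        (a, b): (count / n, count / lhs_counts[a])
        for (a, b), count in pair_counts.items()
    }


def dependency_degree(rows, lhs, rhs):
    """Fraction of rows covered by the dominant rule of each lhs value.

    1.0 means the functional dependency lhs -> rhs holds exactly;
    values slightly below 1.0 suggest a probabilistic dependency.
    """
    pair_counts = Counter((row[lhs], row[rhs]) for row in rows)
    best = {}
    for (a, _b), count in pair_counts.items():
        best[a] = max(best.get(a, 0), count)
    return sum(best.values()) / len(rows)


def suspicious_rows(rows, lhs, rhs, min_conf=0.8):
    """Rows contradicting a rule with confidence >= min_conf (error candidates)."""
    if min_conf <= 0.5:
        # Above 0.5, at most one rule per lhs value can qualify.
        raise ValueError("min_conf must be greater than 0.5")
    strong = {
        a: b
        for (a, b), (_support, conf) in rule_stats(rows, lhs, rhs).items()
        if conf >= min_conf
    }
    return [
        row for row in rows
        if row[lhs] in strong and row[rhs] != strong[row[lhs]]
    ]


# Invented toy data: fold "f1" almost always maps to class "c1".
rows = (
    [{"fold": "f1", "class": "c1"}] * 4
    + [{"fold": "f1", "class": "c2"}]
    + [{"fold": "f2", "class": "c2"}] * 5
)
stats = rule_stats(rows, "fold", "class")
degree = dependency_degree(rows, "fold", "class")
outliers = suspicious_rows(rows, "fold", "class")
```

On this toy input the rule `fold=f1 -> class=c1` has support 0.4 and confidence 0.8, the dependency degree is 0.9, and the single `f1`/`c2` row is flagged. Whether such a row is an error or a genuine exception is a judgment the statistics alone cannot make.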


Participating institution: Faculty of Technology, Computer Science research groups
DDC subject group: Data processing, computer science

Suggested citation:
Apiletti, Daniele ; Bruno, Giulia ; Ficarra, Elisa ; Baralis, Elena (2006) Data Cleaning and Semantic Improvement in Biological Databases. Journal of Integrative Bioinformatics - JIB (ISSN 1613-4516), 3(2), 2006. Special Issue: 3rd Integrative Bioinformatics Workshop, Harpenden, United Kingdom, 2

Online journal: http://journal.imbio.de/index.php?paper_id=40
URL: http://biecoll.ub.uni-bielefeld.de/volltexte/2007/221

Also published by Shaker:
Ralf Hofestädt, Thoralf Töpel (eds.). Integrative Bioinformatics - Yearbook 2006. Shaker, 2007.


 Questions and suggestions to: publikationsdienste.ub@uni-bielefeld.de
 Last modified: 15.2.2011