Universität Bielefeld Electronic Collections

Access to the Document



Characterization of Genetic Signal Sequences with Batch-Learning SOM

Abe, Takashi ; Ikeda, Shun ; Kanaya, Shigehiko ; Wada, Kennosuke ; Ikemura, Toshimichi




Abstract:
Kohonen's self-organizing map (SOM), an unsupervised clustering algorithm, is an effective tool for clustering and visualizing high-dimensional complex data on a single map. We previously modified the conventional SOM for genome informatics by adopting batch learning, which makes the learning process and the resulting map independent of the order of data input (Batch-Learning SOM, BL-SOM). We generated BL-SOMs for tetra- and pentanucleotide frequencies in 300,000 10-kb sequences from 13 eukaryotes for which nearly complete genomic sequences are available. BL-SOM recognized species-specific characteristics of oligonucleotide frequencies in most 10-kb sequences, permitting species-specific classification of the sequences without any information about the species of origin. We next constructed BL-SOMs with tetra- and pentanucleotide frequencies in 37,086 full-length mouse cDNA sequences. Using BL-SOM, we also analyzed the occurrence patterns of oligonucleotides thought to be involved in transcriptional regulation in the human genome.
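
The two ingredients named in the abstract, oligonucleotide frequency vectors and an order-independent batch update, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `kmer_frequencies` and `batch_som_epoch`, the Gaussian neighborhood, and the parameter `sigma` are assumptions chosen for illustration, and the update shown is a generic batch-SOM rule rather than the exact BL-SOM rule.

```python
from itertools import product

import numpy as np


def kmer_frequencies(seq, k=4):
    """Relative frequencies of all 4**k k-mers over A, C, G, T (hypothetical helper)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip windows containing N or other symbols
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def batch_som_epoch(data, weights, grid, sigma):
    """One generic batch-SOM update (assumed rule, not the authors' exact one).

    data:    (n, d) input vectors, e.g. k-mer frequencies of sequence windows
    weights: (m, d) current weight vectors of the m map nodes
    grid:    (m, 2) coordinates of the nodes on the 2-D map
    sigma:   width of the Gaussian neighborhood on the map
    """
    # Squared distance from every input to every node's weight vector: (n, m)
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    bmu = d2.argmin(axis=1)  # best-matching node for each input
    # Squared map distance from every node to every input's best-matching node: (m, n)
    g2 = ((grid[:, None, :] - grid[bmu][None, :, :]) ** 2).sum(axis=2)
    h = np.exp(-g2 / (2.0 * sigma ** 2))
    # Each new weight is a neighborhood-weighted mean over ALL inputs,
    # so the result does not depend on the order of the rows in `data`.
    return (h @ data) / h.sum(axis=1, keepdims=True)
```

Because every node is updated from sums over the whole data set after all best-matching nodes have been found, shuffling the rows of `data` leaves the updated weights unchanged up to floating-point rounding. A conventional online SOM updates the weights after each input and so lacks this property.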


Keywords: batch learning SOM, BL-SOM, oligonucleotide frequency, the Earth Simulator, genome informatics
Institution: Faculty of Technology, Research Groups in Informatics
DDC classification: Data processing, computer science, computer systems

Suggested Citation:
Abe, Takashi ; Ikeda, Shun ; Kanaya, Shigehiko ; Wada, Kennosuke ; Ikemura, Toshimichi (2007) Characterization of Genetic Signal Sequences with Batch-Learning SOM.


URL: http://biecoll.ub.uni-bielefeld.de/volltexte/2007/127



 Questions or comments: publikationsdienste.ub@uni-bielefeld.de
 Latest update: 15 Feb 2011