Universität Bielefeld Electronic Collections

Single pass clustering for large data sets

Alex, Nikolai ; Hammer, Barbara ; Klawonn, Frank

Very large data sets pose new problems for standard neural clustering and visualization algorithms such as Neural Gas (NG) and the Self-Organizing Map (SOM) because of memory and time constraints. In such situations it is no longer possible to hold all data points in main memory at once, and only a few passes over the whole data set, ideally a single one, are affordable if training time is to remain feasible. In this contribution we propose single pass extensions of the classical clustering algorithms NG and fuzzy-k-means, based on a simple patch decomposition of the data set and fast batch optimization schemes for the respective cost functions. The algorithms retain the benefits of the original ones, including easy implementation and interpretation, as well as the flexibility and adaptability that come from the underlying cost function. We demonstrate the efficiency of the approach in a variety of experiments.
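
To make the idea of a patch decomposition concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each patch is clustered together with the prototypes of the previous patch, where each prototype is weighted by the number of points it summarises. For brevity it swaps in ordinary (hard) batch k-means for NG and fuzzy-k-means. The names weighted_kmeans, single_pass_kmeans, patch_size and iters are hypothetical, and the sketch assumes the first patch contains at least k points.

```python
import random


def weighted_kmeans(points, weights, centers, iters=20):
    """Batch k-means on weighted points, starting from the given centers.

    Returns the updated centers and the total weight assigned to each one.
    """
    k, dim = len(centers), len(centers[0])
    mass = [0.0] * k
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        mass = [0.0] * k
        for p, w in zip(points, weights):
            # Assign the point to its nearest center (squared Euclidean distance).
            j = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            mass[j] += w
            for d in range(dim):
                sums[j][d] += w * p[d]
        # Move each center to the weighted mean of its points;
        # a center that received no weight stays where it is.
        centers = [
            [s / mass[j] for s in sums[j]] if mass[j] > 0 else centers[j]
            for j in range(k)
        ]
    return centers, mass


def single_pass_kmeans(stream, k, patch_size, seed=0):
    """Cluster an iterable of points while reading every point exactly once.

    Assumption (not taken from the abstract): the weighted prototypes of the
    previous patch are added to the next patch as a compressed summary of all
    data seen so far.
    """
    rng = random.Random(seed)
    centers, mass = None, None
    patch = []

    def cluster_patch(patch, centers, mass):
        points = list(patch)
        weights = [1.0] * len(points)
        if centers is None:
            # First patch: initialise the centers from the data itself.
            centers = [list(p) for p in rng.sample(points, k)]
        else:
            # Later patches: add the old prototypes with their weights.
            points += centers
            weights += mass
        return weighted_kmeans(points, weights, centers)

    for x in stream:
        patch.append(x)
        if len(patch) == patch_size:
            centers, mass = cluster_patch(patch, centers, mass)
            patch = []
    if patch:  # leftover points that did not fill a whole patch
        centers, mass = cluster_patch(patch, centers, mass)
    return centers, mass
```

Called as single_pass_kmeans(points, k=2, patch_size=50), the function keeps only one patch and k weighted prototypes in memory at any time, which is the property the abstract is after. The quality of the result depends on the patch size and on how representative each patch is, which this sketch does not address.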

Keywords: clustering, neural gas, k-means, large data sets, batch optimization
Participating institution: Faculty of Technology, Computer Science research groups
DDC subject group: Data processing, computer science

Alex, Nikolai ; Hammer, Barbara ; Klawonn, Frank  (2007)  Single pass clustering for large data sets.

URL: http://biecoll.ub.uni-bielefeld.de/volltexte/2007/147

 Questions and suggestions to: publikationsdienste.ub@uni-bielefeld.de
 Last modified: 15 February 2011