Accelerating Relational Clustering Algorithms With Sparse Prototype Representation
Keywords: relational data, pairwise data, dissimilarity data, software implementation; DDC: 004 (Data processing, computer science, computer systems)
Abstract: In some application contexts, data are better described by a matrix of pairwise dissimilarities than by a vector representation. Clustering and topographic mapping algorithms have been adapted to this type of data, either via the generalized median principle or, more recently, with the so-called relational approach, in which prototypes are represented by virtual linear combinations of the original observations. One drawback of these methods is their complexity, which scales as the square of the number of observations, mainly because they use dense prototype representations: each prototype is obtained as a virtual combination of (at least) all the elements of its cluster. In this paper, we propose to use a sparse representation of the prototypes to obtain relational algorithms with sub-quadratic complexity.
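To illustrate why the density of the prototype representation drives the cost, the following minimal sketch evaluates the standard relational dissimilarity between every observation and one prototype, first with a dense coefficient vector and then with a coefficient vector restricted to a small support. It is not the method of the paper: the function names, the fixed support, and the toy data are assumptions made for illustration, and the sketch assumes a symmetric matrix `D` of squared dissimilarities and coefficients that sum to one.

```python
import numpy as np


def relational_distances(D, alpha):
    """Squared dissimilarity between each observation and the virtual
    prototype sum_j alpha[j] * x_j, computed from D only.

    Assumes D is symmetric and holds squared dissimilarities, and that
    alpha sums to one. Cost: O(n^2) for a dense alpha.
    """
    D_alpha = D @ alpha
    return D_alpha - 0.5 * (alpha @ D_alpha)


def relational_distances_sparse(D, support, weights):
    """Same quantity when alpha is nonzero only on `support` (K indices).

    `support` and `weights` are hypothetical names for this sketch.
    Cost: O(n*K + K^2) instead of O(n^2).
    """
    D_alpha = D[:, support] @ weights                         # O(n*K)
    quad = weights @ (D[np.ix_(support, support)] @ weights)  # O(K^2)
    return D_alpha - 0.5 * quad


# Toy data: squared Euclidean dissimilarities between 6 points in the plane.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)

# A prototype supported on 3 of the 6 observations (weights sum to one).
support = np.array([0, 2, 5])
weights = np.array([0.5, 0.25, 0.25])
alpha = np.zeros(len(X))
alpha[support] = weights

dense = relational_distances(D, alpha)
sparse = relational_distances_sparse(D, support, weights)
```

On Euclidean data such as this toy example, both results coincide with the squared distances to the explicit point `alpha @ X`, which is what makes the "virtual" combination usable without ever forming a vector. How the support is chosen and updated is the substance of a sparse relational algorithm and is not addressed by this sketch.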