Dimensionality Reduction of very large document collections by Semantic Mapping
Keywords:
Document Clustering, Dimensionality Reduction, Semantic Mapping, DDC: 004 (Data processing, computer science, computer systems)
Abstract
This paper describes improvements to Semantic Mapping, a feature extraction method useful for reducing the dimensionality of vectors that represent documents in large text collections. The method may be viewed as a specialization of Random Mapping, the method proposed in the WEBSOM project. Semantic Mapping, Random Mapping, and Principal Component Analysis (PCA) are applied to the categorization of document collections using Self-Organizing Maps (SOM). Semantic Mapping generated document representations as good as those of PCA and much better than those of Random Mapping.
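For readers unfamiliar with the baseline, the following is a minimal sketch of Random Mapping as it is commonly described: document vectors are multiplied by a random matrix whose columns are normalized to unit length. The function name `random_mapping`, the Gaussian entries, the toy term-count matrix, and the target dimension of 100 are all assumptions made for illustration; the abstract does not specify them, and this is not the paper's implementation. A specialization such as Semantic Mapping would replace the random matrix with a structured one, but how that matrix is built is not stated here, so it is not sketched.

```python
import numpy as np

def random_mapping(X, k, seed=0):
    """Project rows of X (documents x terms) onto k random directions.

    Hypothetical helper: Gaussian entries and unit-length columns are
    assumptions, not details taken from the abstract.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Random projection matrix of shape (terms, k)
    R = rng.standard_normal((d, k))
    # Normalize each column to unit Euclidean length
    R /= np.linalg.norm(R, axis=0, keepdims=True)
    return X @ R

# Toy data: 20 documents over a 5000-term vocabulary (synthetic counts)
data_rng = np.random.default_rng(42)
X = data_rng.poisson(0.05, size=(20, 5000)).astype(float)

Z = random_mapping(X, k=100)
print(Z.shape)
```

The reduced vectors in `Z` could then be fed to a clustering method such as a SOM in place of the original high-dimensional vectors.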
Published
2007-12-31
Section
Article