Self-Organizing Word Map for Context-Based Document Classification

  • Nikolaos Tsimboukakis
  • George Tambouratzis
Keywords: document classification, word map, hybrid neural network architecture, SOM, MLP
DDC: 004 (Data processing, computer science, computer systems)


In this paper, a novel SOM-based system for document organization is presented. The purpose of the system is to classify the documents of a collection according to their content. The system has a two-level hybrid connectionist architecture that comprises (i) a word map created automatically by a SOM, which functions as a feature extraction module, and (ii) a supervised MLP-based classifier, which provides the final classification result. Experiments performed on Modern Greek text documents indicate that the proposed system effectively separates the different types of text.
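
The first level of the architecture described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how words are encoded or how the map output is turned into document features, so the neighbour-count context vectors, the 2x2 map size, the training schedule, and the per-document histogram over map nodes are all assumptions made for illustration, as are the function names and the toy English corpus. The supervised MLP of the second level is left out; under these assumptions it would take the feature vectors produced below as its input.

```python
import math
import random

def context_vectors(docs, window=1):
    """Assumed word encoding: unit-length counts of neighbouring words."""
    vocab = sorted({w for d in docs for w in d})
    index = {w: i for i, w in enumerate(vocab)}
    counts = {w: [0.0] * len(vocab) for w in vocab}
    for d in docs:
        for i, w in enumerate(d):
            for j in range(max(0, i - window), min(len(d), i + window + 1)):
                if j != i:
                    counts[w][index[d[j]]] += 1.0
    vecs = {}
    for w, v in counts.items():
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vecs[w] = [x / norm for x in v]
    return vecs

def best_matching_unit(nodes, v):
    """Index of the map node whose weight vector is closest to v."""
    return min(range(len(nodes)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(nodes[i], v)))

def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns one weight vector per node."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    nodes = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
        for v in rng.sample(vectors, len(vectors)):
            br, bc = divmod(best_matching_unit(nodes, v), cols)
            for i, w in enumerate(nodes):
                r, c = divmod(i, cols)
                dist2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-dist2 / (2.0 * sigma * sigma))
                for k in range(dim):
                    w[k] += lr * h * (v[k] - w[k])
    return nodes

def document_features(tokens, word_vectors, nodes):
    """Assumed document encoding: normalised histogram of map nodes hit."""
    hist = [0.0] * len(nodes)
    for t in tokens:
        if t in word_vectors:
            hist[best_matching_unit(nodes, word_vectors[t])] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Toy corpus (hypothetical; the paper uses Modern Greek documents).
docs = [
    ["stock", "market", "prices", "rise"],
    ["market", "prices", "fall"],
    ["team", "wins", "match"],
    ["team", "loses", "match"],
]
vecs = context_vectors(docs)
nodes = train_som(list(vecs.values()), rows=2, cols=2)
features = [document_features(d, vecs, nodes) for d in docs]
```

Each entry of `features` is a fixed-length vector with one component per map node, regardless of document length, which is what makes it usable as input to a supervised classifier such as an MLP.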