Cross-Modal Learning of Visual Categories using Different Levels of Supervision

Authors

  • Mario Fritz
  • Geert-Jan M. Kruijff
  • Bernt Schiele

DOI:

https://doi.org/10.2390/biecoll-icvs2007-124

Keywords:

object categorization, cross-modal learning, incremental and interactive learning, DDC: 004 (Data processing, computer science, computer systems)

Abstract

Today's object categorization methods use either supervised or unsupervised training. While supervised methods tend to produce more accurate results, unsupervised methods are highly attractive because they can exploit far larger amounts of unlabeled training data. This paper proposes a novel method that uses unsupervised training to obtain visual groupings of objects and a cross-modal learning scheme to overcome the inherent limitations of purely unsupervised training. The method uses a unified, scale-invariant object representation that allows labeled and unlabeled information to be handled in a coherent way. One potential setting is learning object category models from many unlabeled observations and a few dialogue interactions that can be ambiguous or even erroneous. First experiments demonstrate that the system can learn meaningful generalizations across objects from only a few dialogue interactions.

Published

2007-12-31

Issue

Section

The 5th International Conference on Computer Vision Systems