Towards Spatial Perception: Learning to Locate Objects From Vision


  • Jürgen Leitner
  • Simon Harding
  • Mikhail Frank
  • Alexander Förster
  • Jürgen Schmidhuber



spatial understanding, object localisation, humanoid robot, neural network, genetic programming, DDC: 004 (Data processing, computer science, computer systems)


Our humanoid robot learns to estimate the positions of objects placed on a table with centimetre-range accuracy, even while it is moving its torso, head and eyes. The estimates are provided by trained artificial neural networks (ANN) and by a genetic programming (GP) method, based solely on inputs from the two cameras and the joint encoder positions. No prior camera calibration or kinematic model is used. We find that both ANN and GP localise objects robustly regardless of the robot's pose. These approaches yield an accuracy comparable to current techniques used on the iCub.
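
As a rough illustration of the learning setup described above, the following minimal sketch fits a small feed-forward network that maps a vector of image coordinates and joint encoder readings to an object position. It is not the authors' implementation: the input layout (4 image coordinates plus 9 encoder values), the network size, the training procedure and the synthetic data generator are all assumptions made for the example, and no robot data is involved.

```python
import numpy as np


def make_synthetic_data(n_samples, n_inputs=13, n_outputs=3, seed=0):
    """Placeholder data: a smooth nonlinear map standing in for real robot recordings.

    Assumed layout (hypothetical): 4 image coordinates (u, v per camera)
    followed by 9 joint encoder values, all scaled to [-1, 1].
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_samples, n_inputs))
    W = rng.normal(size=(n_inputs, n_outputs))
    Y = np.sin(X @ W / np.sqrt(n_inputs))  # stand-in for object positions
    return X, Y


def init_mlp(n_inputs, n_hidden, n_outputs, seed=1):
    """One hidden layer with tanh units and a linear output layer."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=1.0 / np.sqrt(n_inputs), size=(n_inputs, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=1.0 / np.sqrt(n_hidden), size=(n_hidden, n_outputs)),
        "b2": np.zeros(n_outputs),
    }


def predict(params, X):
    H = np.tanh(X @ params["W1"] + params["b1"])
    return H @ params["W2"] + params["b2"]


def mse(params, X, Y):
    return float(np.mean((predict(params, X) - Y) ** 2))


def train(params, X, Y, lr=0.1, epochs=200):
    """Full-batch gradient descent on the mean squared error."""
    n_elements = Y.size
    for _ in range(epochs):
        # Forward pass
        H = np.tanh(X @ params["W1"] + params["b1"])
        P = H @ params["W2"] + params["b2"]
        # Backward pass (gradient of the mean over all output elements)
        dP = 2.0 * (P - Y) / n_elements
        dW2 = H.T @ dP
        db2 = dP.sum(axis=0)
        dZ = (dP @ params["W2"].T) * (1.0 - H ** 2)
        dW1 = X.T @ dZ
        db1 = dZ.sum(axis=0)
        # Parameter update
        params["W1"] -= lr * dW1
        params["b1"] -= lr * db1
        params["W2"] -= lr * dW2
        params["b2"] -= lr * db2
    return params


if __name__ == "__main__":
    X, Y = make_synthetic_data(200)
    params = init_mlp(X.shape[1], 20, Y.shape[1])
    print("MSE before training:", mse(params, X, Y))
    params = train(params, X, Y)
    print("MSE after training: ", mse(params, X, Y))
```

With real data, each row of X would hold the object's image coordinates in both cameras together with the encoder readings at that moment, and each row of Y a reference position for the object. The point of the sketch is only that the mapping is learned directly from such pairs, with no camera or kinematic parameters entering the model.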