Online learning of vision-based robot control requires appropriate activation strategies during operation. In this chapter we present such a learning approach with applications to two areas of vision-based robot control. In the first setting, selfevaluation is possible for the learning system and the system autonomously switches to learning mode for producing the necessary training data by exploration. The other application is in a setting where external information is required for determining the correctness of an action. Therefore, an operator provides training data when required, leading to an automatic mode switch to online learning from demonstration. In experiments for the first setting, the system is able to autonomously learn the inverse kinematics of a robotic arm. We propose improvements producing more informative training data compared to random exploration. This reduces training time and limits learning to regions where the learnt mapping is used. The learnt region is extended autonomously on demand. In experiments for the second setting, we present an autonomous driving system learning a mapping from visual input to control signals, which is trained by manually steering the robot. After the initial training period, the system seamlessly continues autonomously. Manual control can be taken back at any time for providing additional training.
In tele-operated robotics applications, the primary information channel from the robot to its human operator is a video stream. For autonomous robotic systems however, a much larger selection of sensors is employed, although the most relevant information for the operation of the robot is still available in a single video stream. The issue lies in autonomously interpreting the visual data and extracting the relevant information, something humans and animals perform strikingly well. On the other hand, humans have great diculty expressing what they are actually looking for on a low level, suitable for direct implementation on a machine. For instance objects tend to be already detected when the visual information reaches the conscious mind, with almost no clues remaining regarding how the object was identied in the rst place. This became apparent already when Seymour Papert gathered a group of summer workers to solve the computer vision problem 48 years ago .
Articial learning systems can overcome this gap between the level of human visual reasoning and low-level machine vision processing. If a human teacher can provide examples of what to be extracted and if the learning system is able to extract the gist of these examples, the gap is bridged. There are however some special demands on a learning system for it to perform successfully in a visual context. First, low level visual input is often of high dimensionality such that the learning system needs to handle large inputs. Second, visual information is often ambiguous such that the learning system needs to be able to handle multi modal outputs, i.e. multiple hypotheses. Typically, the relations to be learned are non-linear and there is an advantage if data can be processed at video rate, even after presenting many examples to the learning system. In general, there seems to be a lack of such methods.
This thesis presents systems for learning perception-action mappings for robotic systems with visual input. A range of problems are discussed, such as vision based autonomous driving, inverse kinematics of a robotic manipulator and controlling a dynamical system. Operational systems demonstrating solutions to these problems are presented. Two dierent approaches for providing training data are explored, learning from demonstration (supervised learning) and explorative learning (self-supervised learning). A novel learning method fullling the stated demands is presented. The method, qHebb, is based on associative Hebbian learning on data in channel representation. Properties of the method are demonstrated on a vision-based autonomously driving vehicle, where the system learns to directly map low-level image features to control signals. After an initial training period, the system seamlessly continues autonomously. In a quantitative evaluation, the proposed online learning method performed comparably with state of the art batch learning methods.