Surprise-based segmentation of behavior for two robots: the spherical robot changes its rolling motion; the hexapod changes from a tripod gait to crawling in a left curve. These changes in behavior result in strong increases of the prediction error e(t) above the prediction confidence B(t) of the active model. Since these types of behavior were not experienced before, new internal models are trained.
Voluntary human behavior seems to be composed of small, elementary building blocks, or behavioral primitives. This modular organization appears to be crucial for our ability to quickly learn complex motor skills, to flexibly adjust our behavior to new tasks, and to deal with our highly redundant motor system.
In this project we address the question of how an embodied agent interacting with a complex environment can discover and learn a repertoire of useful behavioral primitives from scratch. This task poses several challenging problems: (1) The agent needs to discover new and useful types of behavior. (2) The continuous stream of sensorimotor information needs to be segmented in such a way that the underlying behavioral primitives are uncovered. (3) The agent needs to learn models of the discovered behavior that enable the efficient combination of behavioral primitives for goal-directed control.
Our current architecture, the SUBMODES system [ ], explores different types of behavior by means of self-organizing behavior using the DEP-learning rule [ ]. While the agent explores its behavioral capabilities, internal models are trained to predict the motor commands and the resulting sensory consequences of the performed behavior. We use an unexpected increase in prediction error to detect the transition between two behavioral primitives. If such a 'surprising' error is registered, the system either switches to a previously learned internal model or generates a new one (illustrated in the figure). In this way, the system is able to systematically structure the sensorimotor time series on-line into compositional models of behavior. After initial exploration, the agent can use its learned representations for goal-directed control by anticipating the sensory consequences of each available behavior and activating those behavioral models that bring the agent closer to a desired goal state.
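The switch-or-create logic described above can be illustrated with a small toy sketch. It is not the SUBMODES implementation: the `RunningModel` class, the scalar sensor stream, and the parameters `k`, `floor`, and `rate` are assumptions made purely for illustration. Each hypothetical model predicts the next sensor value as a running mean, the prediction error plays the role of e(t), and a bound derived from the model's own error statistics plays the role of the confidence B(t).

```python
import math


class RunningModel:
    """Hypothetical stand-in for an internal model (illustration only).

    Predicts the next sensor value as a running mean and keeps running
    statistics of its own prediction error.
    """

    def __init__(self, first_value):
        self.mean = first_value   # current prediction
        self.err_mean = 0.0       # running mean of the prediction error
        self.err_var = 0.0        # running variance of the prediction error

    def error(self, x):
        """Prediction error e(t) for the observation x."""
        return abs(x - self.mean)

    def confidence(self, k=3.0, floor=0.5):
        """Assumed confidence bound B(t): mean error + k std devs + a floor."""
        return self.err_mean + k * math.sqrt(self.err_var) + floor

    def update(self, x, rate=0.1):
        """Train on x with exponentially weighted updates."""
        delta = self.error(x) - self.err_mean
        self.err_mean += rate * delta
        self.err_var = (1.0 - rate) * (self.err_var + rate * delta * delta)
        self.mean += rate * (x - self.mean)


def segment(stream, k=3.0, floor=0.5):
    """Label each sample of a non-empty stream with the index of the active model."""
    models = [RunningModel(stream[0])]
    active = 0
    labels = []
    for x in stream:
        if models[active].error(x) > models[active].confidence(k, floor):
            # Surprise: e(t) exceeds B(t) of the active model.
            # First look for a previously learned model that explains x.
            known = [i for i, m in enumerate(models)
                     if m.error(x) <= m.confidence(k, floor)]
            if known:
                active = min(known, key=lambda i: models[i].error(x))
            else:
                # No stored model fits, so a new one is generated.
                models.append(RunningModel(x))
                active = len(models) - 1
        models[active].update(x)  # only the active model is trained
        labels.append(active)
    return labels
```

For example, `segment([0.0] * 20 + [10.0] * 20 + [0.0] * 20)` labels the first 20 samples with model 0, creates model 1 for the next 20, and switches back to model 0 for the last 20 rather than creating a third model. A real system would replace the running mean with learned sensorimotor forward models; only the surprise test and the switch-or-create decision are meant to carry over.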
<div class="videoWrapper"><iframe src="//www.youtube.com/embed/QKQnecYjmTA" frameborder="0" allowfullscreen></iframe></div>