|Title||Skills transfer across dissimilar robots by learning context-dependent rewards|
|Publication Type||Conference Proceedings|
|Year of Conference||2013|
|Authors||Malekzadeh MS, Bruno D, Calinon S, Nanayakkara T, Caldwell DG|
|Conference Name||IEEE/RSJ Intl Conf. on Intelligent Robots and Systems (IROS)|
|Conference Location||Tokyo, Japan|
Robot programming by demonstration encompasses a wide range of learning strategies, from simple mimicking of the demonstrator's actions to the higher-level extraction of the underlying intent. Focusing on this last form, we study the problem of extracting, from a set of candidate reward functions, the reward function that explains the demonstrations, and of using this information for self-refinement of the skill. This formulation of the problem has links with inverse reinforcement learning, in which the robot autonomously extracts an optimal reward function that defines the goal of the task. By relying on a Gaussian mixture model, the proposed approach learns how the different candidate reward functions are combined, and in which contexts or phases of the overall task they are relevant for explaining the user's demonstrations. The extracted reward profile is then exploited to improve the skill with an expectation-maximization based self-refinement approach, allowing the imitator to reach a higher skill level than the demonstrator. The approach can be used to reproduce the same skill in different ways or to transfer it across agents with different structures. The proposed approach is tested in simulation with a new type of continuum robot, using kinesthetic demonstrations collected on a Barrett WAM manipulator.
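The core idea of combining candidate reward functions with context-dependent weights can be illustrated with a minimal sketch. This is not the paper's implementation: the candidate rewards (position tracking, effort penalty), the Gaussian activation centers, and the bandwidth are all hypothetical placeholders; the paper learns such weights from demonstrations, whereas here they are fixed by hand to show the mechanism of phase-dependent mixing.

```python
import numpy as np

# Hypothetical candidate reward functions of the robot state x
# (illustrative, not taken from the paper): reach a target at 1,
# and penalize effort/deviation from rest.
def r_position(x):
    return -np.sum((x - 1.0) ** 2)

def r_effort(x):
    return -np.sum(x ** 2)

candidates = [r_position, r_effort]

# Context-dependent weights: Gaussian activations over the task
# phase t in [0, 1], normalized like GMM responsibilities.
# Centers and bandwidth are assumed values for illustration.
centers = np.array([0.25, 0.75])
sigma = 0.15

def weights(t):
    h = np.exp(-0.5 * ((t - centers) / sigma) ** 2)
    return h / h.sum()

def combined_reward(x, t):
    # Weighted sum of candidate rewards, with weights that
    # depend on the current phase of the task.
    w = weights(t)
    return sum(wi * r(x) for wi, r in zip(w, candidates))
```

Early in the task (small `t`) the first candidate dominates; late in the task the second takes over, so each candidate reward is only "in charge" during the phase where it explains the demonstrations.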