Hi everyone! We have a guest speaker, Samuele Tosatto from TU Darmstadt, and you are all cordially invited to the AMLab Seminar on Thursday, February 18th, at 4:00 p.m. CET on Zoom. Samuele will give a talk titled “Movement Representation and Off-Policy Reinforcement Learning for Robotic Manipulation”.
Title: Movement Representation and Off-Policy Reinforcement Learning for Robotic Manipulation
Abstract: Machine learning, and more particularly, reinforcement learning, holds the promise of making robots more adaptable to new tasks and situations.
However, its general sample inefficiency and lack of safety guarantees make reinforcement learning hard to apply directly to robotic systems.
To mitigate the aforementioned issues, we focus on two aspects of the learning scheme.
The first aspect regards robotic movements, which are crucial in manipulation tasks. The usual parametrization of robotic movements allows high expressivity but is often inefficient, as it covers movements not relevant to the task. We analyze how to restrict the representation to only those movements relevant to the task at hand. This novel representation improves sample efficiency and provides greater safety.
A low-quality gradient estimator is another source of inefficiency in reinforcement learning. On-policy gradient estimators are usually easy to obtain, but they are, by their nature, sample inefficient. In contrast, state-of-the-art off-policy estimators are challenging to compute; they are typically divided into importance-sampling and semi-gradient approaches, where the first suffers from high variance and the second from high bias. In this talk, we show a third way to compute off-policy gradients that exhibits a fair bias/variance tradeoff, using a closed-form solution of a proposed non-parametric Bellman equation. This estimator yields particularly high sample efficiency. Our algorithm can be applied offline to human-demonstrated data, providing a safe scheme that avoids dangerous interaction with the real robot.
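As background for the talk, here is a minimal, hedged sketch of the importance-sampling approach the abstract contrasts against: each sample collected under a behavior policy is reweighted by the ratio of target to behavior action probabilities. The function name, the toy Bernoulli policy, and the two-sample dataset are all illustrative assumptions, not part of the speaker's method; the ratios `w` are exactly the terms whose variance motivates the alternative estimator discussed in the talk.

```python
import math

def is_policy_gradient(samples, target_logp, behavior_logp, grad_logp):
    """Importance-sampling off-policy gradient estimate (illustrative sketch).

    samples: list of (action, return) pairs collected under the behavior policy.
    Each score-function term grad_logp(a) * return is reweighted by the
    importance ratio pi_target(a) / pi_behavior(a).
    """
    total, weights = 0.0, []
    for a, ret in samples:
        w = math.exp(target_logp(a) - behavior_logp(a))  # importance ratio
        weights.append(w)
        total += w * grad_logp(a) * ret
    return total / len(samples), weights

# Toy example: two actions, behavior policy uniform, target Bernoulli(theta).
theta = 0.8
target_logp = lambda a: math.log(theta if a == 0 else 1 - theta)
behavior_logp = lambda a: math.log(0.5)
# Score function of the Bernoulli parameter: d/dtheta log pi_theta(a).
grad_logp = lambda a: 1 / theta if a == 0 else -1 / (1 - theta)

samples = [(0, 1.0), (1, 0.5)]
grad, w = is_policy_gradient(samples, target_logp, behavior_logp, grad_logp)
# w = [1.6, 0.4]: ratios far from 1 are what drive the estimator's variance.
```

When the target and behavior policies diverge, these ratios blow up, which is the high-variance failure mode mentioned above; semi-gradient methods drop part of the gradient to avoid this, at the cost of bias.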
To gain more insight into reinforcement learning and robotics, feel free to join and discuss! See you there 🙂!