Extracting Strong Policies for Robotics Tasks from Zero-order Trajectory Optimizers
2021
Conference Paper
Solving high-dimensional, continuous robotic tasks is a challenging optimization problem. Model-based methods that rely on zero-order optimizers like the cross-entropy method (CEM) have so far shown strong performance and are considered state-of-the-art in the model-based reinforcement learning community. However, this success comes at the cost of high computational complexity, which makes these methods unsuitable for real-time control. In this paper, we propose a technique to jointly optimize the trajectory and distill a policy, which is essential for fast execution in real robotic systems. Our method builds upon standard approaches, like guidance cost and dataset aggregation, and introduces a novel adaptive factor which prevents the optimizer from collapsing to the learner's behavior at the beginning of the training. The extracted policies reach unprecedented performance on challenging tasks such as making a humanoid stand up and opening a door without reward shaping.
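For readers unfamiliar with zero-order optimizers, the following is a minimal sketch of a generic cross-entropy method loop on a toy quadratic cost. It is not the paper's method: there is no learned model, guidance cost, dataset aggregation, or adaptive factor here. The names `cem_minimize`, `toy_cost`, `TARGET`, and all hyperparameter values are illustrative assumptions.

```python
import random
import statistics

# Hypothetical target "action vector" for the toy problem (an assumption).
TARGET = [0.5, -1.0, 1.0]


def toy_cost(actions):
    """Squared distance to TARGET; stands in for a trajectory cost."""
    return sum((a - t) ** 2 for a, t in zip(actions, TARGET))


def cem_minimize(cost, dim, iterations=30, population=100,
                 elite_frac=0.2, init_std=2.0, seed=0):
    """Generic CEM: only evaluates cost(x), never its gradient."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim
    n_elite = max(2, int(population * elite_frac))
    for _ in range(iterations):
        # Sample candidates from the current diagonal Gaussian.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(population)]
        # Keep the lowest-cost candidates (the elites).
        elites = sorted(samples, key=cost)[:n_elite]
        # Refit mean and standard deviation to the elites.
        mean = [statistics.fmean([e[i] for e in elites]) for i in range(dim)]
        std = [statistics.pstdev([e[i] for e in elites]) + 1e-6
               for i in range(dim)]
    return mean


best = cem_minimize(toy_cost, dim=len(TARGET))
print(best)
```

Each iteration costs `population` evaluations of the objective, which hints at why repeating such a search at every control step is expensive and why distilling the result into a policy is attractive for fast execution.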
Author(s): | Cristina Pinneri* and Shambhuraj Sawant* and Sebastian Blaes and Georg Martius |
Book Title: | The Ninth International Conference on Learning Representations (ICLR) |
Year: | 2021 |
Month: | May |
Department(s): | Autonomous Learning |
Research Project(s): | Model-based Reinforcement Learning and Planning |
Bibtex Type: | Conference Paper (inproceedings) |
Event Name: | 9th International Conference on Learning Representations (ICLR 2021) |
Article Number: | 1844 |
Note: | *equal contribution |
State: | Published |
URL: | https://openreview.net/forum?id=Nc3TJqbcl3 |
Links: | OpenReview |
BibTex
@inproceedings{pinneri2021:strong-policies,
  title = {Extracting Strong Policies for Robotics Tasks from Zero-order Trajectory Optimizers},
  author = {Pinneri*, Cristina and Sawant*, Shambhuraj and Blaes, Sebastian and Martius, Georg},
  booktitle = {The Ninth International Conference on Learning Representations (ICLR)},
  month = may,
  year = {2021},
  note = {*equal contribution},
  doi = {},
  url = {https://openreview.net/forum?id=Nc3TJqbcl3},
  month_numeric = {5}
}