Authors
Merel, J
Hasenclever, L
Galashov, A
Ahuja, A
Pham, V
Wayne, G
Teh, Y
Heess, N
Last updated
2021-11-11T13:16:41.92+00:00
Abstract
We focus on the problem of learning a single motor module that can flexibly
express a range of behaviors for the control of high-dimensional physically
simulated humanoids. To do this, we propose a motor architecture that has the
general structure of an inverse model with a latent-variable bottleneck. We
show that it is possible to train this model entirely offline to compress
thousands of expert policies and learn a motor primitive embedding space. The
trained neural probabilistic motor primitive system can perform one-shot
imitation of whole-body humanoid behaviors, robustly mimicking unseen
trajectories. Additionally, we demonstrate that it is also straightforward to
train controllers to reuse the learned motor primitive space to solve tasks,
and the resulting movements are relatively naturalistic. To support the
training of our model, we compare two approaches for offline policy cloning,
including an experience-efficient method, which we call linear feedback policy
cloning. We encourage readers to view a supplementary video
(https://youtu.be/CaDEf-QcKwA) summarizing our results.
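The architecture described in the abstract can be illustrated with a minimal sketch: an encoder compresses the current state and a short reference-trajectory snippet into a low-dimensional latent "motor intent" z (the bottleneck), and a decoder (the shared low-level policy) maps the state and z to an action. All dimensions, weights, and function names below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    # Random weights stand in for trained parameters in this sketch.
    return rng.normal(0, 0.1, (in_dim, out_dim)), np.zeros(out_dim)

STATE, REF, LATENT, ACTION = 10, 30, 4, 6  # illustrative sizes only

W_enc, b_enc = linear(STATE + REF, LATENT)
W_dec, b_dec = linear(STATE + LATENT, ACTION)

def encode(state, reference):
    # Encoder: compress state + future reference poses into latent intent z.
    return np.tanh(np.concatenate([state, reference]) @ W_enc + b_enc)

def decode(state, z):
    # Decoder / low-level policy: (state, z) -> action.
    return np.tanh(np.concatenate([state, z]) @ W_dec + b_dec)

state = rng.normal(size=STATE)
reference = rng.normal(size=REF)   # e.g. a short clip of target poses
z = encode(state, reference)       # latent-variable bottleneck
action = decode(state, z)          # one step of imitation
```

One-shot imitation then amounts to encoding an unseen reference trajectory into z and letting the fixed decoder produce actions; task reuse amounts to training a new high-level controller that outputs z directly, with the decoder held fixed.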
Symplectic ID
949226
Download URL
http://arxiv.org/abs/1811.11711v2
Publication type
Conference Paper
Publication date
6 May 2019
Created on 02 Dec 2018 - 17:30.