Puze Liu, Jonas Günster, Jan Peters, Davide Tateo
German Research Center for AI
TU Darmstadt
Robot Parkour (2023), Qi Zhi Lab, Stanford
Robot Soccer (2023), Deepmind
Humanoid Backflip (2024), Unitree
$$ \begin{align*} \max_{\pi} \quad & \mathbb{E}_{\tau \sim \pi} \left[ \sum_{t=0}^{T} \gamma^t r(\vs_t, \va_t) \right] \\ \mathrm{s.t.} \quad & k(\vs_t) < 0 \end{align*}$$
Robot dynamics $\dot{\vs} = f(\vs) + G(\vs) \vu_s $
ATACOM - Acting on the TAngent Space of the COnstraint Manifold
Safe set $$\quad \mathcal{C} = \{\vs \in \mathcal{S} | k(\vs) < 0 \}$$
Constraint manifold in the augmented state space $$ \MM = \{(\vs, \vmu) \in \mathcal{S} \times \mathbb{R}^{+} | c(\vs, \vmu) \coloneqq k(\vs) + \vmu = 0\} $$
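The slack-variable construction above can be illustrated with a minimal sketch. The constraint `k` used here (stay inside the unit disc) and the helper names `slack` and `c` are hypothetical choices for illustration, not taken from the slides. A safe state yields a positive slack, so $(\vs, \vmu)$ lies on the manifold; an unsafe state would need a negative slack, which falls outside $\mathbb{R}^{+}$.

```python
import numpy as np

def k(s):
    # Hypothetical inequality constraint: k(s) = ||s||^2 - 1 < 0 (unit disc).
    return np.array([s @ s - 1.0])

def slack(s):
    # Slack that satisfies c(s, mu) = k(s) + mu = 0 for a given state.
    return -k(s)

def c(s, mu):
    # Equality constraint defining the manifold in the augmented space.
    return k(s) + mu

s_safe = np.array([0.3, 0.4])        # inside the disc, so k(s) < 0
mu_safe = slack(s_safe)              # positive slack
on_manifold = np.allclose(c(s_safe, mu_safe), 0.0)

s_unsafe = np.array([2.0, 0.0])      # outside the disc, so k(s) > 0
mu_unsafe = slack(s_unsafe)          # negative: not an admissible slack
```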
Velocity tangent to the manifold $$\mathrm{T}_{(\vs, \vmu)}\MM =\left\{ (\dot{\vs}, \dot{\vmu}) | \dot{c}(\vs, \vmu) = \begin{bmatrix} \mJ_k & \mathbb{I} \end{bmatrix} \begin{bmatrix} \dot{\vs} \\ \dot{\vmu} \end{bmatrix} = \vzero \right\}$$
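A basis of this tangent space is the null space of $\begin{bmatrix} \mJ_k & \mathbb{I} \end{bmatrix}$. The sketch below computes one with an SVD; the function name `tangent_basis` and the example Jacobian are illustrative assumptions, and the SVD is only one of several ways to obtain such a basis.

```python
import numpy as np

def tangent_basis(J_k):
    # J_c = [J_k, I] is the Jacobian of c(s, mu) = k(s) + mu.
    m, _ = J_k.shape
    J_c = np.hstack([J_k, np.eye(m)])
    # J_c has full row rank m (identity block), so the last rows of Vt
    # span its null space, i.e. the tangent space of the manifold.
    _, _, Vt = np.linalg.svd(J_c)
    return J_c, Vt[m:].T

# Hypothetical example: gradient of k(s) = ||s||^2 - 1 at s = (0.3, 0.4).
J_k = np.array([[0.6, 0.8]])
J_c, B = tangent_basis(J_k)

# Any velocity B @ a keeps c constant to first order: J_c @ (B @ a) = 0.
residual = J_c @ B
```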
Acting on the TAngent space of the COnstraint Manifold (ATACOM)
$\begin{bmatrix} \vu_s \\ \vu_\mu \end{bmatrix} = \textcolor{#b2e061}{\underbrace{-\mJ_u^{\dagger} \vpsi}_{\text{Drift Comp.}}}$ $\textcolor{#fd7f6f}{\underbrace{- \lambda \mJ_u^{\dagger} \vc}_{\text{ Contraction }}}$ $\textcolor{#7eb0d5}{\underbrace{+ \mB_u \va }_{\text{ Tangential }}} $
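The three terms of this control law can be sketched numerically. The slides do not define $\vpsi$, $\mJ_u$, or the slack dynamics, so this sketch assumes the simplest case $\dot{\vmu} = \vu_\mu$, which gives $\dot{c} = \vpsi + \mJ_u \vu$ with $\vpsi = \mJ_k f(\vs)$ and $\mJ_u = \begin{bmatrix} \mJ_k G(\vs) & \mathbb{I} \end{bmatrix}$. The function name `atacom_control` and all numeric values are hypothetical.

```python
import numpy as np

def atacom_control(J_k, f, G, c, a, lam=1.0):
    # Assumes slack dynamics mu_dot = u_mu (an assumption, not from the slides).
    m = J_k.shape[0]
    psi = J_k @ f                               # drift of the constraint
    J_u = np.hstack([J_k @ G, np.eye(m)])       # effect of [u_s; u_mu] on c_dot
    J_pinv = np.linalg.pinv(J_u)
    _, _, Vt = np.linalg.svd(J_u)
    B_u = Vt[m:].T                              # basis of the null space of J_u
    u = -J_pinv @ psi - lam * (J_pinv @ c) + B_u @ a
    c_dot = psi + J_u @ u                       # resulting constraint velocity
    return u, c_dot

J_k = np.array([[0.6, 0.8]])
f = np.array([0.1, -0.2])
G = np.eye(2)
c = np.array([0.05])                            # small constraint violation
u1, c_dot1 = atacom_control(J_k, f, G, c, a=np.array([1.0, -0.5]), lam=2.0)
u2, c_dot2 = atacom_control(J_k, f, G, c, a=np.array([-3.0, 4.0]), lam=2.0)
```

Under these assumptions the drift term cancels $\vpsi$, the contraction term yields $\dot{c} = -\lambda \vc$, and the agent action $\va$ only moves the system along the tangent space, so `c_dot1` and `c_dot2` agree regardless of `a`.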
Training setup in the real world
| Setting | Success rate | Hitting velocity |
|---|---|---|
| Simulation | 86% | 0.92 m/s |
| Zero-shot transfer | 12% | 0.97 m/s |
| Fine-tuning | 71% | 0.97 m/s |
Mobile Robot with Differential Drive
15 Moving Obstacles
Human Robot Interactions
Simulated Scenario
ATACOM: Safe Exploration on the Tangent Space of the Constraint Manifold
Safe Exploration in Dynamic Environment
Learning Safe Policy in the Human Robot Interaction Scenario