Reaching the Limit in Autonomous Racing: Optimal Control versus Reinforcement Learning (SciRob 23)

A central question in robotics is how to design a control system for an agile, mobile robot. This paper studies this question systematically, focusing on a challenging setting: autonomous drone racing. We show that a neural network controller trained with reinforcement learning (RL) outperforms optimal control (OC) methods in this setting. We then investigate which fundamental factors have contributed to the success of RL or have limited OC. Our study indicates that the fundamental advantage of RL over OC is not that it optimizes its objective better but that it optimizes a better objective. OC decomposes the problem into planning and control with an explicit intermediate representation, such as a trajectory, that serves as an interface. This decomposition limits the range of behaviors that can be expressed by the controller, leading to inferior control performance when facing unmodeled effects. In contrast, RL can directly optimize a task-level objective and can leverage domain randomization to cope with model uncertainty, allowing the discovery of more robust control responses. Our findings allow us to push an agile drone to its maximum performance, achieving a peak acceleration greater than 12 g and a peak velocity of 108 km/h. Our policy achieves superhuman control within minutes of training on a standard workstation. This work presents a milestone in agile robotics and sheds light on the role of RL and OC in robot control.
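To illustrate the abstract's point that RL "can directly optimize a task-level objective and can leverage domain randomization," here is a minimal sketch in Python. It is not the authors' actual training setup: the 1-D point-mass dynamics, the parameter ranges, and the stand-in proportional policy are all hypothetical, chosen only to show the pattern of resampling model parameters (mass, drag) every episode while scoring rollouts with a task-level reward rather than trajectory-tracking error.

```python
import random

def make_randomized_dynamics(rng):
    """Sample perturbed model parameters for one training episode
    (domain randomization over mass and drag)."""
    mass = 1.0 * rng.uniform(0.8, 1.2)   # +/- 20% mass uncertainty
    drag = 0.1 * rng.uniform(0.5, 1.5)   # +/- 50% drag uncertainty

    def step(vel, thrust, dt=0.01):
        # Toy 1-D vertical point-mass: acc = thrust/mass - drag*vel - g
        acc = thrust / mass - drag * vel - 9.81
        return vel + acc * dt

    return step

def rollout(policy, rng, steps=100):
    """Run one episode under freshly randomized dynamics and return
    the accumulated task-level reward (negative tracking cost)."""
    step = make_randomized_dynamics(rng)
    vel, target, reward = 0.0, 5.0, 0.0
    for _ in range(steps):
        thrust = policy(vel, target)
        vel = step(vel, thrust)
        reward -= (vel - target) ** 2   # task objective, not a reference trajectory
    return reward

# Trivial proportional controller standing in for the trained neural policy.
policy = lambda v, tgt: 9.81 + 4.0 * (tgt - v)

rng = random.Random(0)
rewards = [rollout(policy, rng) for _ in range(10)]
```

Because the dynamics differ every episode, a policy that scores well across rollouts must be robust to model mismatch — the mechanism the paper credits for RL's advantage over a fixed-model OC pipeline.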

Reference:
Y. Song, A. Romero, M. Müller, V. Koltun, D. Scaramuzza,
"Reaching the Limit in Autonomous Racing: Optimal Control versus Reinforcement Learning",
Science Robotics, September 13, 2023

Affiliations:
Y. Song, A. Romero, and D. Scaramuzza are with the Robotics and Perception Group, Dep. of Informatics, University of Zurich, and Dep. of Neuroinformatics, University of Zurich and ETH Zurich, Switzerland
M. Müller and V. Koltun are with Intel Labs

Comments

While I am not an expert on either, it seems to be a matter of definitions. Yes, if you impose a trajectory as the target for optimal control, you are solving an inverse dynamics problem, which means multi-order differentiation. This significantly limits the scope of feasible solutions at the input level, yet it leaves you with a more tractable optimisation problem (finding a global optimum within the remaining scope of solutions). However, it is surely possible to define an optimal control problem at the task level as well. This enlarges the scope of solutions, and likely exposes many local optima that are better than the global optimum of the trajectory-based problem described above. But now you have an optimisation problem that is no longer globally tractable, which forces you to settle for (quasi-)local optimisation approaches, of which reinforcement learning is a fine example.

sepdriessen

Well done guys 👌👍 It's been a long way, happy that I could also be a part of it before my retirement (as a Drone Racing Pilot) 😉 best Regards Kay

ArkoN

Wow! This is really cool stuff! What splendid automation 😮 Thank you, algorithm, for taking me here 😊

LukeVader

Wow... I was already sure that control theory could push human limits as far as one can imagine, but this leaves me literally speechless...

thecontrolenggeek

Next step: add weather conditions. Damn this was incredible!

limbo

Congratulations, that is a really extraordinary feat.

Gosuminer

Interesting, I wonder if it's possible to integrate OC techniques with DQN, to maintain the efficiency but make it amenable to formal verification.

bourr

Impressive, but also a bit scary. Shouldn't have read "Kill Decision" by Daniel Suarez before watching this.

flwi

Amazing! Imagine future humanoid robots having their own Olympics because they surpass humans. Probably within the next two decades.

sam.rodriguez

I think these human racers are really good. The robot is just 0.5 s faster (about 10%).

aaxa

I dream that one day I can achieve that. Damn! 😆

fanshi

The music is too loud. You'd be better off without music, but at least turn it down to background levels.

rogerbye

You can see in the young kid's eyes that he just realized he's out of a job :(

rocketmike