Reaching the Limit in Autonomous Racing: Optimal Control versus Reinforcement Learning (SciRob 23)

A central question in robotics is how to design a control system for an agile, mobile robot. This paper studies this question systematically, focusing on a challenging setting: autonomous drone racing. We show that a neural network controller trained with reinforcement learning (RL) outperforms optimal control (OC) methods in this setting. We then investigate which fundamental factors have contributed to the success of RL or have limited OC. Our study indicates that the fundamental advantage of RL over OC is not that it optimizes its objective better but that it optimizes a better objective. OC decomposes the problem into planning and control with an explicit intermediate representation, such as a trajectory, that serves as an interface. This decomposition limits the range of behaviors that can be expressed by the controller, leading to inferior control performance when facing unmodeled effects. In contrast, RL can directly optimize a task-level objective and can leverage domain randomization to cope with model uncertainty, allowing the discovery of more robust control responses. Our findings allow us to push an agile drone to its maximum performance, achieving a peak acceleration greater than 12 g and a peak velocity of 108 km/h. Our policy achieves superhuman control within minutes of training on a standard workstation. This work presents a milestone in agile robotics and sheds light on the role of RL and OC in robot control.
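To illustrate the abstract's point that RL "can directly optimize a task-level objective and can leverage domain randomization," here is a minimal sketch in Python. It is not the authors' actual training setup: the 1-D point-mass dynamics, the parameter ranges, and the stand-in proportional policy are all hypothetical, chosen only to show the pattern of resampling model parameters (mass, drag) every episode while scoring rollouts with a task-level reward rather than trajectory-tracking error.

```python
import random

def make_randomized_dynamics(rng):
    """Sample perturbed model parameters for one training episode
    (domain randomization over mass and drag)."""
    mass = 1.0 * rng.uniform(0.8, 1.2)   # +/- 20% mass uncertainty
    drag = 0.1 * rng.uniform(0.5, 1.5)   # +/- 50% drag uncertainty

    def step(vel, thrust, dt=0.01):
        # Toy 1-D vertical point-mass: acc = thrust/mass - drag*vel - g
        acc = thrust / mass - drag * vel - 9.81
        return vel + acc * dt

    return step

def rollout(policy, rng, steps=100):
    """Run one episode under freshly randomized dynamics and return
    the accumulated task-level reward (negative tracking cost)."""
    step = make_randomized_dynamics(rng)
    vel, target, reward = 0.0, 5.0, 0.0
    for _ in range(steps):
        thrust = policy(vel, target)
        vel = step(vel, thrust)
        reward -= (vel - target) ** 2   # task objective, not a reference trajectory
    return reward

# Trivial proportional controller standing in for the trained neural policy.
policy = lambda v, tgt: 9.81 + 4.0 * (tgt - v)

rng = random.Random(0)
rewards = [rollout(policy, rng) for _ in range(10)]
```

Because the dynamics differ every episode, a policy that scores well across rollouts must be robust to model mismatch — the mechanism the paper credits for RL's advantage over a fixed-model OC pipeline.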

Reference:
Y. Song, A. Romero, M. Müller, V. Koltun, D. Scaramuzza,
"Reaching the Limit in Autonomous Racing: Optimal Control versus Reinforcement Learning",
Science Robotics, September 13, 2023

Affiliations:
Y. Song, A. Romero, and D. Scaramuzza are with the Robotics and Perception Group, Dep. of Informatics, University of Zurich, and Dep. of Neuroinformatics, University of Zurich and ETH Zurich, Switzerland
M. Müller and V. Koltun are with Intel Labs

Comments

While I am not an expert on either, it seems to be a matter of definitions. Yes, if you impose a trajectory as the target for optimal control, you are solving an inverse dynamics problem, which means multi-order differentiation. This significantly limits the scope of feasible solutions at the input level, yet it leaves you with a more tractable optimisation problem (finding a global optimum within the remaining scope of solutions). However, it is surely possible to define an optimal control problem at the task level as well. This enlarges the scope of solutions, and likely exposes many local optima that are better than the global optimum of the trajectory-based problem described above. But now you have an optimisation problem that is no longer globally tractable, which forces you to settle for (quasi-)local optimisation approaches, of which reinforcement learning is a fine example.

sepdriessen

Well done guys 👌👍 It's been a long way, happy that I could also be a part of it before my retirement (as a Drone Racing Pilot) 😉 best Regards Kay

ArkoN

Wow! This is really cool stuff! What splendid automation 😮 Thank you, algorithm, for taking me here 😊

LukeVader

Wow... I was already sure that control theory could push human limits as far as one can imagine, but this leaves me literally speechless...

thecontrolenggeek

Next step: add weather conditions. Damn this was incredible!

limbo

Congratulations, that is a really extraordinary feat.

Gosuminer

Interesting, I wonder if it's possible to integrate OC techniques with DQN, to maintain the efficiency but make it amenable to formal verification.

bourr

Impressive, but also a bit scary. Shouldn't have read "Kill Decision" by Daniel Suarez before watching this.

flwi

Amazing! Imagine future humanoid robots having their own Olympics because they surpass humans. Probably within the next two decades.

sam.rodriguez

I think these human racers are really good. The robot is just 0.5 s faster (about 10%).

aaxa

I dream that one day I can achieve that. Damn! 😆

fanshi

The music is too loud. You'd be better off without music, but at least turn it down to background levels.

rogerbye

You can see in the young kid's eyes that he just realized he's out of a job :(

rocketmike