Lecture 18: MIT 6.800/6.843 Robotics Manipulation (Fall 2021) | 'Reinforcement Learning (Part 1)'

Comments

Lecture 19 is missing from the playlist

teetanrobotics

How does the discussion on sampling, and your recent paper on Randomized Smoothing, tie in with the work on "sampled differential dynamic programming" by Rajamäki et al.? If I recall correctly, that paper mentions using samples to fit a covariance matrix, which is equivalent to fitting the Hessian of a second-order approximation of the DDP cost function. You mention that this sampling approach may be behind a lot of the successes of RL policies; it's interesting to consider what it could do for the planning side with DDP / MPC.
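To make the covariance/Hessian connection concrete, here is a minimal sketch of my own (not code from the paper or the lecture): Stein-type zeroth-order estimators of the gradient and Hessian of a Gaussian-smoothed cost, built purely from sampled function values. The toy cost f, the sigma, and the sample count are all placeholders.

```python
# A minimal sketch, assuming a generic scalar cost f; not from the paper
# or lecture.  Stein-type zeroth-order estimators for the smoothed cost
# f_sigma(x) = E[f(x + eps)], eps ~ N(0, sigma^2 I):
#   grad f_sigma(x) = E[(f(x+eps) - f(x)) * eps] / sigma^2
#   hess f_sigma(x) = E[(f(x+eps) - f(x)) * (eps eps^T - sigma^2 I)] / sigma^4
# Subtracting the baseline f(x) leaves both expectations unchanged
# (E[eps] = 0, E[eps eps^T] = sigma^2 I) but greatly reduces variance.
import jax
import jax.numpy as jnp

def f(x):
    # Toy stand-in for a DDP stage cost / cost-to-go.
    return jnp.sum(x ** 2) + jnp.sin(x[0])

def smoothed_grad_hess(key, x, sigma=0.1, n=20000):
    d = x.shape[0]
    eps = sigma * jax.random.normal(key, (n, d))     # sampled perturbations
    df = jax.vmap(lambda e: f(x + e))(eps) - f(x)    # baseline-subtracted values
    g = (df[:, None] * eps).mean(axis=0) / sigma**2
    outers = jax.vmap(jnp.outer)(eps, eps)           # (n, d, d) sample outer products
    H = (df[:, None, None] * (outers - sigma**2 * jnp.eye(d))).mean(axis=0) / sigma**4
    return g, H

key = jax.random.PRNGKey(0)
x = jnp.array([1.0, -2.0])
g, H = smoothed_grad_hess(key, x)
print(g)  # ~ [2*x0 + cos(x0), 2*x1] for small sigma
print(H)  # ~ diag(2 - sin(x0), 2)
```

Note that the Hessian estimator is exactly a weighted sample covariance of the perturbations, which I take to be the equivalence the paper is pointing at.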

I certainly agree that these sampling methods are coming back into vogue thanks to better GPUs. Coming from the locomotion field, I can say that many of our DDP approaches are compute-limited, partly by gradient computation on the CPU (and CPUs aren't getting much faster). It's food for thought that, with increasingly powerful GPUs, maybe we can just give up on a lot of model structure and instead power through with parallelisation. But as you mention, constraint handling is still more of a pain, and I still think we're pushing the edge of the envelope with regard to dimensionality.
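As an illustration of that trade (again a toy sketch of my own; the dynamics, cost, horizon, and sample count are all made up, and none of this is from the lecture): a random-shooting MPC planner that evaluates thousands of rollouts in parallel on the accelerator and never computes an analytic dynamics gradient.

```python
# A sketch of the "give up on model structure, power through with
# parallelisation" idea: random-shooting MPC with all candidate action
# sequences rolled out in parallel via vmap, no dynamics gradients anywhere.
import jax
import jax.numpy as jnp

def dynamics(x, u):
    # Toy double integrator standing in for a simulator step.
    dt = 0.05
    pos, vel = x
    return jnp.array([pos + dt * vel, vel + dt * u])

def rollout_cost(x0, us):
    # Total cost of one action sequence; lax.scan keeps it jit/vmap friendly.
    def step(x, u):
        x_next = dynamics(x, u)
        c = x_next[0] ** 2 + 0.1 * x_next[1] ** 2 + 0.01 * u ** 2
        return x_next, c
    _, costs = jax.lax.scan(step, x0, us)
    return costs.sum()

@jax.jit
def plan(key, x0, horizon=30, n_samples=4096, sigma=2.0):
    us = sigma * jax.random.normal(key, (n_samples, horizon))
    costs = jax.vmap(rollout_cost, in_axes=(None, 0))(x0, us)  # parallel rollouts
    return us[jnp.argmin(costs)]  # best-of-N; MPPI would use a soft average

key = jax.random.PRNGKey(0)
x0 = jnp.array([1.0, 0.0])
u_seq = plan(key, x0)
print(u_seq[0])  # first action of the best sampled sequence
```

Swapping the argmin for an exponentially weighted average over costs turns this into MPPI; either way, the expensive part is embarrassingly parallel, which is exactly where GPUs beat CPU-side gradient computation.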

jamesfoster