AI-Based 3D Pose Estimation: Almost Real Time!

📝 The paper "3D Human Pose Machines with Self-supervised Learning" and its source code are available here:

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
313V, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Christian Ahlin, Christoph Jadanowski, Claudio Fernandes, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, Jason Rollins, Javier Bustamante, John De Witt, Kaiesh Vohra, Kasia Hayden, Kjartan Olason, Lorin Atzberger, Marcin Dukaczewski, Marten Rauschenberg, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Morten Punnerud Engelstad, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Richard Reis, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Thomas Krcmar, Torsten Reil, Zach Boldyga, Zach Doty.

Károly Zsolnai-Fehér's links:
Comments

Gait recognition is certainly becoming much easier.
I just want a good body tracker for VR.

kebakent

I think there are many more mass-market applications of this than are mentioned in the video. For example, broadcasting sports events. Instead of broadcasting a plain video of the event, the poses of the athletes are estimated and broadcast. They are then reskinned on the viewers' screens using "skins" resembling the athletes. The viewers are then able to watch the game from any angle and in any resolution their computer can handle. Sure, it wouldn't look photorealistic, but a generation used to Fortnite and Minecraft might not care.

DontfallasleeZZZZ

You mention 51 ms for a single frame, but I assume that if you were doing this on a real-time feed, the algorithm could be significantly optimized using temporal smoothness, since the pose changes very little between two frames.

ehsan_kia
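
One simple way to exploit the temporal coherence described in the comment above is to rerun the expensive estimator only when the frame has changed enough, and reuse the previous pose otherwise. The sketch below is a generic caching heuristic, not the method from the paper; `CachedPoseEstimator`, `mean_abs_diff`, the flattened-pixel frame format, and the threshold are all hypothetical names and assumptions chosen for illustration.

```python
from typing import Callable, List, Optional, Sequence

Frame = Sequence[float]  # assumption: a frame is a flat sequence of grayscale pixels
Pose = List[float]       # assumption: a pose is a flat list of joint coordinates


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute per-pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


class CachedPoseEstimator:
    """Wraps a slow estimator and skips it when the frame barely changed."""

    def __init__(self, estimate: Callable[[Frame], Pose], threshold: float) -> None:
        self.estimate = estimate      # the expensive per-frame model (hypothetical)
        self.threshold = threshold    # how much change triggers a full rerun
        self.last_frame: Optional[Frame] = None
        self.last_pose: Optional[Pose] = None
        self.full_runs = 0            # counts how often the slow path was taken

    def __call__(self, frame: Frame) -> Pose:
        changed = (
            self.last_frame is None
            or mean_abs_diff(frame, self.last_frame) > self.threshold
        )
        if changed:
            # Slow path: run the full estimator and remember this keyframe.
            self.last_pose = self.estimate(frame)
            self.last_frame = frame
            self.full_runs += 1
        # Fast path falls through and returns the cached pose.
        return self.last_pose
```

The comparison is always made against the last frame that was fully processed, so slow drift eventually accumulates past the threshold and forces a rerun. Whether this saves time in practice depends on how cheap the difference check is relative to the model.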

Currently, for AAA-quality mocap you need 16 to 32 (or more) cameras in a perfectly well-lit room, cable-managed to a server which connects to your workstation.
If it were just 1 camera, the cost would be so much lower, and there would be less room for error with fewer moving parts. Perhaps you could train the algorithm with a second camera angle for added speed and reliability.

Curt-

This can help indie 3D game developers create good mocap animations on a budget. Great invention!

alialtaf

Great, no need for expensive off-site motion capture studios and their limiting schedule constraints.

nononono

This is huge! Always good stuff from the channel.

ryanbrown

Could this allow Google Translate for sign language?

briandiehl

Very powerful stuff here. It would be interesting to see this technology applied to a game of soccer / football. The general goal of every player is known. Analyzing postures in a dynamic system could provide an agent with the information needed to best avoid the most likely threats to its current goal.

brigfiche

Why no video examples? Is there a problem with temporal consistency?

Dragonblood

This is really cool! But please note that latency and fps (frames per second) are not related. Even though the latency is 51 ms, a fast computer can produce output at any desired frame rate. It is just a question of parallelism.

henriksundt
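
The distinction drawn in the comment above between latency and frame rate can be made concrete with a little arithmetic. The sketch below assumes ideal scaling, where several frames are processed concurrently and each still takes the same time from input to output; `throughput_fps` and the worker count of 4 are hypothetical, and real hardware will usually scale less cleanly.

```python
def throughput_fps(latency_ms: float, workers: int = 1) -> float:
    """Frames per second when `workers` frames are in flight at once,
    each taking `latency_ms` from input to output (ideal scaling assumed)."""
    if latency_ms <= 0 or workers < 1:
        raise ValueError("latency_ms must be positive and workers at least 1")
    return workers * 1000.0 / latency_ms


sequential = throughput_fps(51.0)            # one frame at a time
pipelined = throughput_fps(51.0, workers=4)  # four frames in flight (hypothetical)
print(f"{sequential:.1f} fps, {pipelined:.1f} fps")  # prints "19.6 fps, 78.4 fps"
```

In both cases every individual frame still arrives 51 ms after it was captured; only the number of frames delivered per second changes.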

Why do you mention the milliseconds it takes but not the resolution or hardware used to achieve that time?

timgo

This almost seems better than the Kinect, and it's not even using IR lights for depth. It just needs to be a bit faster.

john_hunter_

Can we use this for gait recognition? I mean, after estimating the pose, could we construct a Gait Energy Image based on that?

rushirajparmar

Its obvious application is going to be military.

SiddharthKulkarniN

Isn't there a video out there of these poses being fed in and used to generate CGI output? Anyone know what I'm talking about and have a link?

MatthiasTTV

Hahahaha please someone build this into a slouch detection algorithm for any of us spending hours on the computer

MobyMotion

51ms per scene while batch predicting 1000 scenes, versus 51ms for each independent scene, are drastically different performance numbers. You can't speed up real world inputs in real-realtime processing, unless you create a "buffer" and wait for the data to accumulate to a batch. But then it wouldn't really be realtime.
The reason I'm saying this: recently I coded an AI that can make thousands of predictions in tens of microseconds. Then, when I pushed it to production and had user requests come in one by one, it took literally *seconds* per sample. I'm so fired...

deep.space.
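
The gap between batched and one-by-one timings described in the comment above is often explained by a fixed per-call overhead that a large batch spreads over many samples. The sketch below uses a simple linear cost model; the model itself, the function names, and the numbers (20 ms overhead, 0.5 ms per item) are hypothetical illustrations, not measurements from the paper or from any real system.

```python
def batch_time_ms(n: int, overhead_ms: float, per_item_ms: float) -> float:
    """Total time for one call on a batch of n samples (assumed linear model)."""
    if n < 1:
        raise ValueError("batch size must be at least 1")
    return overhead_ms + per_item_ms * n


def amortized_ms(n: int, overhead_ms: float, per_item_ms: float) -> float:
    """Average time per sample when n samples share one call."""
    return batch_time_ms(n, overhead_ms, per_item_ms) / n


# Hypothetical costs: 20 ms fixed overhead per call, 0.5 ms of work per sample.
single = amortized_ms(1, 20.0, 0.5)        # 20.5 ms per sample
batched = amortized_ms(1000, 20.0, 0.5)    # about 0.52 ms per sample
wait = batch_time_ms(1000, 20.0, 0.5)      # 520 ms until the batch finishes
print(f"{single:.2f} ms, {batched:.2f} ms, {wait:.0f} ms")
```

Under these assumed numbers the batched figure looks about 40 times better per sample, yet nothing in the batch is available until the whole call completes, which is the latency a live feed would actually experience.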

Where can I get the algorithm and code?

RaselAhmed-ixee

Hi, is there any software that uses camera tracking or motion capture to create a variety of animsets for games? I want to make a game myself with AI's help, and I need references for how to make games with AI. You could say it's my goal to do it. Thanks...

MILADISGONE