This New AI Can Find Your Dog In A Video! 🐩


📝 The paper "MTTR - End-to-End Referring Video Object Segmentation with Multimodal Transformers" is available here:

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Steef, Taras Bobrovytsky, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.

Thumbnail background image credit:

Károly Zsolnai-Fehér's links:
Comments

I would like to be able to point my camera at the forest and have it pick out and highlight any animals in the picture, or any mushrooms, or any specific types of plants for me, so I can just walk up to those plants or animals or whatever.

ChadKovac

A bit confused. All of these videos were of two subjects with clear distinctions, yet the search key for selecting them had these elaborate, highly specific descriptions. For instance, the search keys 'man' and 'surfboard' would have been completely sufficient for selecting the two. Is the AI actually capable of understanding these complex descriptions, or were all those descriptors unnecessary fluff to make it seem more impressive? If I gave it the same surfing video but with 5 more surfers, all with different appearances and different colored boards, would it actually select the correct one? If so, these videos were poorly picked to demonstrate that, imo. Also, admittedly a bit confused by this being compared to pose estimation. Image segmentation and pose estimation are rather tenuously related. Odd comparison.

radpugguy

I see this being used to make a comprehensive, searchable database of all human video. Something akin to what Google did a few years ago by scanning every book they could get their hands on. With YouTube already under their Alphabet umbrella, they have a big starting database.

felekar

Whoa!
I've said it before and I'll say it again: multimodal AIs are the future.
This is insane!
I wanna see all these things (this one, CLIP / CLOOB and/or DALL-E, CM3, and more) combined into one big end-to-end any-modality-to-any-modality model

Kram

Amazing! This can be used in robotics to make robots really smart, e.g. "Jarvis, bring me a bottle of water"

michaelvechtomov

Just imagine: the end of rotoscoping. I could tell the video editing program what I wanted to cut out and it would rotoscope a mask of exactly what I wanted. Literal hours of busy work saved

willnine

This will be such a huge plus to add into Getty Images or any other stock footage company, to detect exactly what a person needs

InnoSang

Can you imagine what this implementation could do in the world of medicine? For example, if we use it as a predictive model for the progression of a mobility-debilitating issue, we could catch it early on without using invasive techniques, making diagnostic medicine so much more affordable.
Mind blowing

Mufasa

Find: Crouching Tiger, Hidden Dragon.

I can’t wait for a generative version of this:
"Add Dr Károly, holding onto his papers"

Will-ktjk

The Hungarian accent has really grown on me over the past year. It's so endearing! I remember how the first video I watched drew me in; it was the one on the hide-and-seek AI.

jfk_the_second

I'm confused by the demo videos, because they don't show the "extra" description being used in any way at all. You could say "man", "skateboard", "racket", etc. and it would still find what's there. I think the demo is really lacking

SeyHan

Are there examples of them telling it to look for the wrong thing? Like, in the examples where you give a specific color, if you named the wrong color, would it decline to highlight anything, as expected? Or are we just being shown the correct way?

robertrynard

With every video, the "yea, we are simulated" becomes more apparent

ТуанНгуен-ьп

Alexa, bring me the yellow surfboard from the left!

ccosmin

This will be/is a great tool for autonomous driving and AI applications

beautolan

If I'm not mistaken, the technique in this paper doesn't do tracking or pose estimation, but rather segmentation. That's a somewhat different task.

Vassay

Does this work for determining whether an object actually is within a video? Or does it just apply the colouring to whatever it thinks is closest? If it is the former, this could be great for sifting through video data, say, security footage, to find occurrences of events.

shayboual

Can't wait for the next paper: "replace the person riding a bike by a capybara wearing a hat and sunglasses"

pasikavecpruhovany

Wow, end-to-end trivial to learn surveillance!

StephenRoseDuo

I'm not actually questioning the validity of the software here, more like questioning the efficacy of the demonstrations. Let's say at 4:19, the thing we are searching for is "a tennis racket in the hand of a player with a red skirt"; however, simply going off this video, for all we know it is only tracking a tennis racket, because there is nothing else in the scene that it would really need that prompt to differentiate between. Aside from that, it really is awesome

blankblank