DeepMind's AI Learns Object Sounds | Two Minute Papers #224

The paper "Objects that Sound" is available here:

Our Patreon page with the details:

One-time payment links are available below. Thank you very much for your generous support!
Bitcoin: 13hhmJnLEzwXgmgJN7RB6bWVdT7WkrFAHh
Ethereum: 0x002BB163DfE89B7aD0712846F1a1E53ba6136b5A
LTC: LM8AUh5bGcNgzq6HaV1jeaJrFvmKxxgiXg

Recommended for you:

We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Andrew Melnychuk, Brian Gilman, Christian Ahlin, Christoph Jadanowski, Dave Rushton-Smith, Dennis Abts, Emmanuel, Eric Haddad, Esa Turkulainen, Evan Breznyik, Frank Goertzen, Kaben Gabriel Nanlohy, Malek Cellier, Marten Rauschenberg, Michael Albrecht, Michael Jensen, Michael Orenstein, Raul Araújo da Silva, Robin Graham, Shawn Azman, Steef, Steve Messina, Sunil Kim, Torsten Reil.

Károly Zsolnai-Fehér's links:
Comments
Author

you know something is god damn cool when even though you practice machine learning you don't understand how this was possible.

polares
Author

There goes our ability to blame someone else for a fart. These AI are a real threat to humanity indeed.

pierrecousyn
Author

What would be amazing is if a future version of this technique can isolate sounds coming from specific locations! For instance, if you had a video with two people talking simultaneously, you could say "filter out any audio that corresponds to the right half of the screen" and then only be left with the voice of one person! Machine learning (especially as it pertains to music and audio and video) is insane.

legoalien
Author

Can it locate the sound source from a video of an object that doesn't produce movement? For example, a video of a room with static objects, one of which is a radio that plays music.

mitkoogrozev
Author

This is amazing. After watching the AI video interpolation video and then this one, it seems pretty likely that AI is going to have the capability to do some wildly incredible things with our video and audio data in the near future.

CalvinRRC
Author

Hello, incredible work, and if it weren't for you I would never have known about this. Please, if you have contact with the people who did this work, tell them to train the computer to differentiate good from bad, or to have a logic that surpasses that but is inclined toward the wellbeing of all beings. They should also create a chatbot that gathers all this amazing deep learning from across the internet but follows a moral line.

TheGraphiteCovenant
Author

The ultimate test: running these against StSanders shreds

syksaje
Author

lol, someone should make it do 3d visualization of echolocation. make it do a reconstruction of the place it was recorded in. NSA intensifies!

nonameplsno
Author

That's genius, I hope it will solve the movie translation problem automatically 😂

AmmarTaicho
Author

This is cool. Would love to learn more.

intisarchy
Author

Are there any similar papers that focus on identifying timbres within audio that has more than one source of noise? E.G. identifying individual instruments in a song. And thanks Károly for all your wonderful videos! :)

Michaelvaleriani
Author

How well does it work when there are multiple things moving in the scene but only one is producing sound? As it is, it kind of looks like the neural network just highlights the parts with the most movement.

ymi_yugy
Author

You would be a great addition to steemit 💡☺️

leapmind
Author

Do the highlighted features represent sound sources? I thought they represented detected pixel features strongly associated with detected acoustic features. I believe the guitar's neck was highlighted as a recurrent feature in guitar music, not as a prediction of the physical source of a guitar's sound.

jonmichaelgalindo
Author

Does anyone know a practical application of the Z-transform? Is it related to programming, since code is also a kind of digital system?

jaikumar