Sound Recognition - Computerphile

How do you go about making a device recognise individual sounds? Audio Analytic's Dr Chris Mitchell on how they approached the problem.

Audio Analytic is a sound recognition software company based in Cambridge UK & Palo Alto USA.

Dodgy sound effects courtesy Sean's Roland electronic drum kit...

This video was filmed and edited by Sean Riley.

Comments
My CS Thesis is on Sound Recognition. This stuff gets so abstract so quickly it gets really hard to explain to people. He did a better job than I could!

iau
3:13
Honestly, the door squeak sound effect you have is great.

Architector_
7:27 “Last time I checked my window for breaking it didn't speak to me”

You just didn't understand it.

stensoft
never thought I would hear Tyler the Creator in a computerphile video

guyman
SlipKnot music in a computerphile video? I never thought I'd see that

AmxCsifier
i would really prefer a video on speech recognition

realeques
Very nice collection of vulnerable devices in the back.

Ivo--
Summary of the whole 15 min: "we have to cut the sound precisely, and feed it to the computer". Wrong title, maybe.
And I'm no expert in sound recognition, but this makes it sound like most of it happens in the time domain rather than the frequency domain, which seems doubtful to me.

anothergol
I am a hobbyist music maker, and I have kind of created my own dictionary for describing sounds, for example "smooth, deep, dark blue, hollow, rubbery" or "smooth, sharp, narrow, high, reddish/brown", and such :-D
They would be highly insufficient for sound recognition purposes, but I would not be surprised if they at least partially overlapped with the features they used. As far as I know, many people who make music, at least those who make it on a computer and so have to think about these things consciously, have a similar, more or less developed dictionary to describe sound, melody, and rhythm features.

MidnightSt
Great video and great explanation.
It would get tricky when you need to create a database of glass breaking, because you actually have to break all these different types of glass. That could get expensive very quickly.

lindascoon
What's easier for a computer, sound recognition or image recognition?

Cesariono
I was disappointed by this video. This is an interesting topic, for which there were really two viable approaches to an entertaining 15-minute video: a high-level overview of the technology this guy developed, with diagrams of the parts of a specific use case or device, "how it works" style; or the first video in a series, like the ones you've done with Dr. Pound, that covers prerequisite fundamental concepts first and builds up to a more in-depth examination of an end-to-end example. This video isn't either; it's 15 minutes of a guy speaking at such a high level of abstraction that nothing is actually explained, because he ostensibly thinks that Computerphile viewers aren't sophisticated enough to understand his work, as is made evident by his condescending remark at 8:33 - 8:40. How about an example of a feature that IS useful, some examples of what exactly the metrics are, and what sounds it can be used to differentiate?

shiphorns
Any chance we could get links to papers/blogs/projects on the subject? I would really like to try hacking on sound recognition, especially on a low-power device.

DeathTickle
A bit more detail about speech recognition would have been useful. Perhaps a future video?

Lolwutdesu
I worked on early DSP-based speech recognition used in telephony. We've come a long way; I never really imagined talking to your TV remote would be a common thing.

realcygnus
Will there be more videos like this? I'd love to see some more in-depth stuff. I jibbed out of going to university to do this sort of thing; bags of regret around that now!

ipg
What do I see on the desk? 132 column LINE PRINTER PAPER? Does that still exist in 2017?

jlinkels
15 mins of virtually nothing, and it's the edited version.

Dima-htrb
I would think designing a security system based on audio would be challenging, as an attacker could simply jam the classifier with extremely loud, fairly short bursts of wideband noise to temporarily raise the noise floor. I can think of a few mitigations, but I'd imagine it would be pretty hard to design a system with audio as the only sensor.

kylegreen
It is incredibly meta to be watching this with Google-generated subtitles.

kowalityjesus