DeepMind's AI Learns The Piano From The Masters of The Past

The paper "The challenge of realistic music generation: modelling raw audio at scale" is available here:

We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
313V, Andrew Melnychuk, Angelos Evripiotis, Brian Gilman, Christian Ahlin, Christoph Jadanowski, Dennis Abts, Emmanuel, Eric Haddad, Esa Turkulainen, Geronimo Moralez, Kjartan Olason, Lorin Atzberger, Marten Rauschenberg, Michael Albrecht, Michael Jensen, Morten Punnerud Engelstad, Nader Shakerin, Owen Skarpness, Rafael Harutyuynyan, Raul Araújo da Silva, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Thomas Krcmar, Torsten Reil, Zach Boldyga.

Two Minute Papers Merch:

Károly Zsolnai-Fehér's links:
Comments

Episode 34,002: AI Learns to Create a Two Minute Papers Video

Hello fellow scholars, this is Karlo Jahleed Fahir.

This is absolutely incredible work, and it is breathtaking that with only a few samples the AI has learnt to create an almost perfect replica of one of my videos.

What a time to be alive, and see you guys next time.

kanewilliams

Imagine this connected to a machine on your head that checks how you feel, so it can create the perfect music for you.

kebomueller

OK, this is quite interesting.
First of all, I don't think the results are that impressive, at least not when it comes to producing structures longer than ~2 seconds.
It is extremely good at very short-term composition, on the scale of half a bar to one bar - better at that than any other composition algorithm I know.
In terms of structure it still falls short, though. The AI keeps starting really beautiful phrases with a lot of potential, but it doesn't manage to hold a consistent thought for more than a few seconds, so it constantly feels like it just fell short of one nice, beautiful phrase. It's close, though; it just needs structures that are 3-4 times longer, in my opinion.
However, there was NO structure over a whole piece at all (or even over just 10 seconds), and in that regard the results are worse than what I've previously seen from other algorithms.
Generally there are too many notes and too few repetitions.
Still good stuff, though.

Wegnerrobert

This AI didn't just learn to play the piano - it learnt to sing it! The fact that these are synthetic samples is mind-boggling; I missed what you said for a second and thought for sure that I was listening to real samples from the database.

MobyMotion

Honestly, I'm not very impressed. The music it creates sounds almost random. Yeah, it's picking up on some basic harmonies and chords, but that in itself is not very impressive, I think.

I'm curious when they'll manage to actually make it expressive and coherent!

lukasdon

The crazy thing to me about this isn't really the music (it's still too random to be pleasant for longer than 10 seconds) but the fact that it's doing this with raw audio data. The fact that the result sounds like someone playing a piano is _very_ impressive to me.

simoncarlile

Well, to be honest, it still does not really sound like music: no real melody, no play with dynamics, no theme or anything. It just seems to play some harmonies.

ymi_yugy

Somehow, the fact that Chopin comprised the highest percentage of material studied makes sense when listening to the clips.

curtishammer

Is it just me who can't really follow what the AI is playing? I can barely hear any melody in the generated music.

freddychen

Is the distortion in the AI music playback due to the AI synthesizing the audio itself, or does the noise come from elsewhere?

thomassynths

Definitely a long way from even average-level human composition, but I still think it's really impressive that it has learned to perform so well tonally (very little dissonance; it manages to stay in key, somewhat). The fact that it has learned all of that practically from first principles (raw audio, not a score) is what makes this so different from previous computer-generated composers.

jakejakeboom

Makes me wonder what generating at multiple levels of detail could do to improve the results. Kind of like Perlin noise with its different octaves: a first, slow pass determines the chord progression, and a faster pass determines the melody built on top of those chords.

GrantNelson
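For readers who want to see that two-timescale idea concretely, here is a toy sketch: a slow pass picks a chord progression, and a faster pass samples melody notes from whichever chord is active. The chord table and all names are illustrative, not taken from the paper.

```python
# Toy sketch of two-timescale generation: a slow "pass" chooses a chord
# progression, and a faster pass fills in melody notes drawn from the
# active chord. Everything here is illustrative.
import random

# C major diatonic triads as MIDI note numbers (illustrative).
CHORDS = {
    "C":  [60, 64, 67],
    "F":  [65, 69, 72],
    "G":  [67, 71, 74],
    "Am": [57, 60, 64],
}

def slow_pass(n_bars):
    """Coarse timescale: one chord per bar."""
    return [random.choice(list(CHORDS)) for _ in range(n_bars)]

def fast_pass(progression, notes_per_bar=4):
    """Fine timescale: melody notes sampled from the active chord."""
    melody = []
    for chord in progression:
        tones = CHORDS[chord]
        melody.extend(random.choice(tones) for _ in range(notes_per_bar))
    return melody

progression = slow_pass(n_bars=4)
print(progression)             # e.g. ['C', 'Am', 'F', 'G']
print(fast_pass(progression))  # 16 MIDI notes, 4 per bar
```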

As a composer, I think the days of my profession are numbered :) Even though right now the results of AI attempts to recreate musical performance sound pretty weird and not well structured, remember: technologies develop not linearly, but exponentially.

shrammstorm

Now train this AI on all sorts of music styles, not just classical, and see what comes out.

rogerab

It doesn't just learn how to play the piano. It isn't just deciding when and how hard to hit which keys: it's generating the waveform itself. That means it not only plays the piano; it is the piano it's playing as well.

williambarnes
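A minimal sketch of what that means in practice, assuming a WaveNet-style autoregressive loop: the model emits raw audio one sample at a time, conditioned on the samples it has already produced, with no key or MIDI events anywhere. `predict_next` below is a stand-in for a trained network, not the paper's model.

```python
# Sample-level autoregressive generation: each new audio sample is a
# function of the previously generated samples. The "network" here is a
# stand-in: a two-tap recurrence that happens to produce a 440 Hz sine.
import numpy as np

SAMPLE_RATE = 16_000  # audio samples per second

def predict_next(history):
    """Stand-in for a trained net: predicts the next sample from the past."""
    w = 2 * np.pi * 440.0 / SAMPLE_RATE
    return 2 * np.cos(w) * history[-1] - history[-2]

def generate(n_samples):
    audio = np.zeros(n_samples)
    w = 2 * np.pi * 440.0 / SAMPLE_RATE
    audio[0], audio[1] = 0.0, np.sin(w)      # seed the recurrence
    for i in range(2, n_samples):
        audio[i] = predict_next(audio[:i])   # next sample from all past ones
    return audio

waveform = generate(SAMPLE_RATE)  # one second of "audio", sample by sample
```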

Writing music is easy. Writing great music that millions want to listen to is a different level.

thealexanderbond

Contrastive predictive coding is very interesting, as is the VQ-VAE, as a step towards increasing the ability to model long-term dependencies. There's a huge design space to explore for neural audio synthesis, and I've been doing a lot of research in this area - a huge passion of mine.

Conditional generative models give us the ability to control these neural audio dreams and open a new world of potential applications.

RichardAssar
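For context, the VQ-VAE step mentioned here snaps each encoder output to its nearest entry in a learned codebook, so a prior model can work on short sequences of discrete codes instead of hundreds of thousands of raw samples. A minimal numpy sketch, with illustrative shapes and sizes:

```python
# VQ-VAE quantization step: each encoder output vector is replaced by
# its nearest codebook entry, yielding discrete tokens that a compact
# autoregressive prior can model over long time spans.
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(512, 64))   # 512 learned code vectors, dim 64

def quantize(z_e):
    """Map encoder outputs (T, 64) to discrete codes and quantized vectors."""
    # Squared distance from every latent frame to every codebook entry.
    d = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    codes = d.argmin(axis=1)            # (T,) integer tokens
    return codes, codebook[codes]       # tokens + their code vectors

z_e = rng.normal(size=(100, 64))        # 100 latent frames from an encoder
codes, z_q = quantize(z_e)
print(codes[:10])                       # the prior models these tokens
```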

Still lacks the highest-level structure - intro, culmination, and so on. The samples are too short to even fit it.
I think there is a fundamental problem here: music is all about transferring emotion, not about pretending to be a professional musician, and we don't have a dataset of music labeled that way. To make a good musical AI, I think we should give the machine not only pieces of music but also EEG recordings of people listening to that music, or at least label which moments carry which emotions. We can talk about real creativity only after it learns the concept of emotions.

luck

It definitely sounds like an autoencoder - it sounds blurry, which is often characteristic of autoencoders. They should try using a GAN.

Music generation will be very difficult, because we need long-term dependencies, as in text, and short-term dependencies, something like texture in images. So we would need a mix of recurrent and convolutional networks.

FlyingOctopus
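A hedged sketch of what such a convolutional-plus-recurrent mix could look like, assuming spectrogram-frame inputs; the class name, layer sizes, and shapes are arbitrary illustrations, not the paper's architecture.

```python
# Convolutions capture short-term "texture"; a GRU carries longer-term
# state across the downsampled feature sequence.
import torch
import torch.nn as nn

class ConvRNN(nn.Module):
    def __init__(self, n_mels=80, hidden=256):
        super().__init__()
        # Short-range structure: strided 1-D convs over the time axis.
        self.conv = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
        )
        # Long-range structure: a recurrent net over the conv features.
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_mels)   # predict the next frame

    def forward(self, x):             # x: (batch, n_mels, time)
        h = self.conv(x)              # (batch, hidden, time // 4)
        h, _ = self.rnn(h.transpose(1, 2))
        return self.head(h)           # (batch, time // 4, n_mels)

model = ConvRNN()
out = model(torch.randn(2, 80, 400))  # two 400-frame spectrogram clips
print(out.shape)                      # torch.Size([2, 100, 80])
```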

IDK, it sounded a bit awkward to me; there didn't seem to be an overall "story" behind the music.

dlbattle