But can China's new AI write a Good Tune?

Pitting DeepSeek against ChatGPT on a series of increasingly complicated musical challenges.

Comments

It should be said that the main innovation of DeepSeek is not improved output, but a drastic reduction in the hardware required to produce something in the same ballpark as current AI models. It sacrifices a little quality in exchange for being more affordable, more accessible to non-profit organizations, universities, and research institutes, and less harmful to the environment. That's a trade-off I'll gladly accept any day.

imveryangryitsnotbutter

It was DeepThink R1 that was worthy of attention, because that is the reasoning model able to compete with o1 while being open-source. That is why at 2:40 you can't read what the model is actually "thinking" the way you could when you used R1, just a summary of the thinking. Asking o1 what its internal monologue was is one of the few ways a paying customer can get permanently banned from the service.

MrDanMaster

I feel that DeepSeek has more "character" than ChatGPT. What I mean by character is that DeepSeek is more stubborn: it will not try to correct errors unless they are obvious. So if you want something good, you should start a new chat, ask DeepSeek the same question again, and ask it to evaluate what it produced previously by giving it its earlier creation.

I have tried DeepSeek many times, and after a certain number of tries it will give up and refuse to give you any more new answers. That's why I think DeepSeek is better, hahaha; it is more like a human.

franklee

"can a robot write a symphony?"
"yes I can, can you?"
from the director and producer of I, robot
We, robots on theatres soon

questtech

8:09 "cher ami, you grasp the storm but forget the poetry. quartal harmonies need not be barbaric— make them sing. where is the rubato? the bass crawls where it should dance. and this ending— mon dieu, it stops rather than dissolves!"
LOL

johnchessant

You should've turned on the DeepThink button from the first prompt, since it calls a different model (it's like starting with GPT-4o and then letting o1 pick up the previous job).

nahlene

7:30 Keep the prompt concise. Avoid words such as: try, oh, please, can you, would you, could you. Instead use direct language and be assertive with the model; it tends to produce better results. Here is a better version:

Write a fast, dramatic miniature Chopin-style prelude, featuring quartal harmony instead of triadic harmony. Incorporate aspects of Chopin's writing, reduce superficial elements. Decide the key (except C) and the time signature. And give me a list of beats on which to change pedal.


From the DeepSeek-R1 paper: "Prompt Engineering: When evaluating DeepSeek-R1, we observe that it is sensitive to prompts. Few-shot prompting consistently degrades its performance. Therefore, we recommend users directly describe the problem and specify the output format using a zero-shot setting for optimal results."

It's better to start new chats rather than continue existing ones, and to keep the prompt as direct as possible. Just FYI, I hope it was helpful! Happy seeking :)
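The advice above (drop hedging openers, keep the instruction direct and zero-shot) can be sketched as a small helper. This is purely illustrative: the `HEDGES` list and the `tighten_prompt` name are my own, not part of any DeepSeek tooling.

```python
import re

# Hedging openers the comment suggests avoiding; the list is illustrative.
HEDGES = ["please", "could you", "would you", "can you", "try to", "oh"]

def tighten_prompt(prompt: str) -> str:
    """Strip leading hedge phrases so the prompt starts with a direct imperative."""
    text = prompt.strip()
    changed = True
    while changed:
        changed = False
        for hedge in HEDGES:
            # Word boundary so "oh" doesn't eat the start of an unrelated word.
            new_text = re.sub(rf"^{re.escape(hedge)}\b[\s,]*", "", text,
                              flags=re.IGNORECASE)
            if new_text != text:
                text = new_text
                changed = True
    # Re-capitalize the imperative that remains.
    return text[:1].upper() + text[1:] if text else text

print(tighten_prompt("Please, could you try to write a dramatic prelude in C minor."))
# → "Write a dramatic prelude in C minor."
```

Mechanically stripping politeness words is of course a blunt approximation; in practice you would still rewrite the prompt by hand into the direct form shown in the comment.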

Akuma.

You should try Gemini models. They are technically multi-modal (understand multiple modalities such as text, images/video, and audio). I wonder if that audio understanding would help.

TheGuyWhoGamesAlot

I'm afraid real Chopin would just have a stroke.
I agree, it's 'kinda interesting'

vladthemagnificent

"Stumbling into something interesting, rather than actually being competent"
I think you just summed up honestly the creative process of every artist :P
Maybe that's an insight into why these models (in their current forms at least) can work so well as idea-generators and starting points.

TheStigma

I always found the DeepThink button hilarious.

shadmium

This is dumb.
I know we are already abusing chatbots for something they were not even designed to do, but the biggest issue here is having a _single_ conversation with each.

Sometimes both Chet Jippity and DeepSeek just start off on the wrong foot.
Like, you'd ask "what's 1 + 1" and _most_ of the time they answer 2, but sometimes it's wrong. Which is what I feel happened to _both_ of them here—but especially DeepSeek.

And the important thing is, you _need_ to start a new conversation if the very first answer it gave was totally, catastrophically bad. I believe the logic behind it is that if the AI looks at the history of the conversation, and sees that it was dumb and had to be corrected or reprimanded, then it will _keep_ playing that role as that is statistically way more likely to happen than the "person" suddenly becoming a genius.

*My suggestion*, if you're gonna make another such video in the future:
- try at least 3 (preferably 5) conversations each, sending them the exact same message
- play them all and quickly compare them, choosing only the best ones of each AI
- continue the rest of the video as you would
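The best-of-N procedure in this suggestion can be sketched as follows. The `generate` and `score` callables are hypothetical stand-ins: a real run would call an LLM API once per fresh conversation (no shared history) and score each piece by listening to it.

```python
import random
from typing import Callable

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str], float],
              n: int = 3) -> str:
    """Run n independent generations (each a fresh conversation with no
    shared history) and keep the highest-scoring candidate."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)

# Toy demo with a stubbed generator and a numeric "score"; no API involved.
random.seed(0)
def fake_generate(prompt: str) -> str:
    return f"take-{random.randint(1, 100)}"

best = best_of_n("Write a Chopin-style prelude.", fake_generate,
                 lambda s: int(s.split("-")[1]), n=5)
print(best)
```

The point of the fresh conversations is exactly the one made above: a new chat carries no history of earlier bad answers, so each candidate is drawn from the model's unconditioned distribution rather than from a "dumb assistant" persona it has already fallen into.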

hundvd_

I don't know how to read music or what half of the names you said were, but I was really intrigued. I would love to see more things like this.

TobsterMK

I broke R1 last with a question about modes and using Nashville numbers and Roman numeral analysis… it recursively puked for 5 min and invented/hallucinated five additional letter classes for the Nashville number system… 😅

michaelvarney.

The beginnings of the first few songs by DeepSeek are actually really nice, but then it completely drops the ball. I could see using it as an interesting starting point, in my head at least, although I know nothing about music really.

nonetrix

Fun, but I'm not sure what the point is when you have actual AI music-generation engines tuned for this purpose, like Udio and Suno, that can actually produce meaningful output.

d.d.jacksonpoetryproject

That image-to-music video was genius, and I'd love a copy of that repo. Soooo good!

Edit: Is it on the Patreon?

RengokuGS

The internal thinking is really fascinating to watch fully play out, IMO. Yes, it reveals quite a bit of the "stupidity" of AI models that can otherwise seem a bit like magic, but it also lets you identify some core misunderstandings that prevent the model from producing a better answer, or at least keep it focusing its efforts on the wrong things. These are often very basic issues, like not being quite sure about the context of your question. If you then clarify or correct some of the facts it uses to arrive at the answer and ask it to reevaluate with the new information, output quality usually improves massively. A system that can integrate this feedback and correction into further refinement becomes self-improving through crowdsourcing. It is much harder to spot reasoning issues when this process is opaque.

This is already a big difference between open projects like DeepSeek R1 and proprietary ones like GPT. Giving away the internal thought process means others can distill the model and train on it very efficiently, so proprietary systems won't want to give you that insight. It's obvious why that makes sense from a capitalist angle, but it also means missing out on a great deal of natural learning and better understanding between humans and the model, which seems pretty essential for improving the technology as a whole. Yes, you can skim the whole internet for info and learn a lot, but that's not nearly as useful for self-reflection as direct human feedback and commentary (I think... as someone who admittedly doesn't make LLMs :P )

TheStigma

Good point about open-ended instructions producing "better" results. The fewer constraints you put into the request, the better the model can usually extrapolate from its training something that "fits together" well, and so the output is often perceived as "higher quality" or "more natural". The downside, obviously, is that the output will be more random in scope and may include things that aren't relevant. Always consider which constraints you actually need initially, and add further constraints as you narrow down and refine the idea. Over-instruction is a very real thing.
The funny thing is, this applies just as well to image creation or co-writing a fictional story. It applies quite broadly to AI models in general.

TheStigma

Pretty random stuff - you know what they say: If you make a thousand monkeys type on typewriters... So, if you're patient, it may produce something worth hearing eventually.

N-JKoordt