DeepSeek Coder AI 🐍 The Best Coding Model I've Tested? (Open-Source)

In this video, we use a new coding rubric to test a coding-specific LLM called DeepSeek Coder. It's an incredible model that is fine-tuned for coding tasks, and today, we're going to see if it's as good as they say.

Enjoy :)

Join My Newsletter for Regular AI Updates 👇🏼

Need AI Consulting? ✅

Rent a GPU (MassedCompute) 🚀
USE CODE "MatthewBerman" for 50% discount

My Links 🔗

Media/Sponsorship Inquiries 📈
Links:
Comments

What tests should I add to the new coding-specific rubric?

matthew_berman

Yes, please. A tutorial for LLMs in VS Code would be great. How do we deploy them, local vs. cloud, and what would a basic workflow look like? Thank you 😇


I love it!! Snake works perfectly! The 7B model is also impressive: it gets pretty close to GPT-4 for a 7B, and it's also capable of creating Snake locally!

CronoBJS

Please make the tutorial, this looks like a ton of fun!

royalcanadianbearforce

For the formatting of the !=, I think the term is "ligature", and different fonts can include them.

Ligatures are special characters in a font that combine two (or more) troublesome characters into one. For instance, in serifed text faces, the lowercase f often collides with the lowercase i and l. To fix this, the fi and fl are often combined into a single shape (what pros would call a glyph).

kurtesimo

Insane. Which parameter size did you use for the test? I don't think you said.

EDIT: If it was the 33B, then I think it would be worth the hassle to test the other models, or at least the 3B or 7B, because I think that's what consumer-level hardware can run pretty easily.

metafa

Super cool! +1 for a tutorial on the VS Code integration. Is it possible to use a local LLM for coding without cutting and pasting into the files?

luigitech

A local copilot would be wild if you can get that to work!

theresalwaysanotherway

Hey Matthew, really enjoy your videos. Just a suggestion on the code challenge section: if you change the engineered prompt to "Please find the issues with this code and explain in detail: <code>", you will find the outcome to be more in line with expectations.
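The suggested rewording is easy to drop into a test harness as a simple template. A minimal sketch, assuming a plain prompt-string approach (the helper name and the buggy snippet below are illustrative, not from the video):

```python
# Hypothetical helper that wraps a code snippet in the review-style
# prompt suggested in the comment above (names are illustrative).
def build_review_prompt(code: str) -> str:
    return f"Please find the issues with this code and explain in detail:\n{code}"

# A deliberately buggy snippet to review: add() subtracts instead of adding.
buggy = "def add(a, b):\n    return a - b"

prompt = build_review_prompt(buggy)
print(prompt.splitlines()[0])
# → Please find the issues with this code and explain in detail:
```

The resulting string would then be sent to the model in place of the original challenge prompt; only the instruction wording changes, not the code under test.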

KevinMcNamara-hd

I appreciate these overviews, and this is great to see! I teach undergrad CS, and before teaching full-time my work was in creative text generation, creative agents, and using AI to support creativity (glad I defended before GPT-2 dropped). I've been hoping to find coding models we could run on our local servers for (at least a subset of) students to eventually use, but I don't have the time to keep up with the wild progress without videos like these. +1 for sure on VS Code usage; anything on deployment and practical workflow integration is fantastic. Great vid as always!

Oh, and in terms of some niche use cases for different kinds of code benchmarks: beyond generation, code reviews, checking if requirements are met or violated, and checking test case coverage have all been almost entirely hallucinations in the limited tests I've done, so those are always appreciated (assuming the model is meant to support instruct).

connorhillen

Thank you for doing that multi-turn error testing; that's what all coding LLM testing should include!

SinanAkkoyun

Awesome video. Please make a video about using an open-source LLM as a copilot in Visual Studio Code. That sounds very interesting.

kamelsf

The ≠ formatting is a rendering option. VS Code supports it. Under the hood it's still the same text.
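In VS Code this rendering is controlled by the `editor.fontLigatures` setting, and it only takes effect if the editor font actually ships ligature glyphs (Fira Code is one commonly used example; the font choice below is just an illustration):

```jsonc
// settings.json — enable font ligatures so sequences like != render
// as combined glyphs; the underlying file text is unchanged.
{
  "editor.fontFamily": "Fira Code",
  "editor.fontLigatures": true
}
```

Setting `"editor.fontLigatures": false` (the default) restores the literal two-character rendering.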

thenoblerot

The score results vs. GPT-3.5 and GPT-4: I'm guessing all the scores are for un-quantized models? It would be interesting to see how the scores compare for the different quantized versions of the open models too.

frankjohannessen

So what's next on the game test front? Breakout? Space Invaders?

Djungelurban

You mentioned Phind, and I wanna make sure you know that while Phind did release an open model, Phind/Phind-CodeLlama-34B-v2, later iterations have all been proprietary and behind a commercial cloud offering. So when people say "Phind is the best", they generally aren't talking about Phind/Phind-CodeLlama-34B-v2; rather, they are talking about the proprietary commercial offering.

CognitiveComputations

As a programmer with 20+ years of experience, I can say that now I'm starting to become impressed. The code identification element (in this case quick sort) is extremely valuable on its own. Finally, a local AI model has as much smarts as a high schooler.
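For readers curious what such an identification test looks like: the model is handed an unlabeled snippet and asked to name the algorithm. A typical quick-sort snippet of that kind might resemble this sketch (not the exact code from the video):

```python
# A typical unlabeled quick-sort implementation of the sort used in
# "what algorithm is this?" tests: pick a pivot, partition, recurse.
def quicksort(xs):
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return quicksort(left) + [pivot] + quicksort(right)

print(quicksort([5, 3, 8, 1, 9, 2]))  # → [1, 2, 3, 5, 8, 9]
```

Recognizing the pivot-and-partition structure, rather than just executing it, is what makes the identification task a meaningful test of the model.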

NOTNOTJON

Yes please to the VS Code open-source copilot. Folks working behind a corporate firewall would be rescued by that!!! 🙏

simonmassey

I have a suspicion that the snake game was in their training dataset, that's why it always nails it.

stickmanland

This was one of the better code testing processes I've seen you do on models. I don't expect most of them to get things on the first try, but rather after some encouragement.

It would be great to see creative and effective ways to use open source llm models to build useful applications.

seancriggs