[ML News] AI models that write code (Copilot, CodeWhisperer, Pangu-Coder, etc.)

#mlnews #ai #copilot

OUTLINE:
0:00 - Intro
0:20 - Copilot Now Generally Available
3:20 - FOSS Org leaves GitHub
6:45 - Google's Internal ML Code Completion
9:10 - AI Trains Itself to Code Better
14:30 - Amazon CodeWhisperer in Preview
15:15 - Pangu-Coder: A New Coding Model
17:10 - Useful Things

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
Comments
Author

I’ve been using Copilot and I’m really impressed by it overall. My impression is that how much it helps depends heavily on the kind of coding you’re doing.

Doing JavaScript web development? Copilot is a **huge** efficiency boost.

Implementing some Reinforcement Learning algorithm you recently read about in Python (what I’m currently doing)? Copilot’s assistance really falls off. It has no clue what you’re trying to do.

It goes to show that these tools help most along very well-traveled coding paths. That’s very helpful to some, less so for others.

Mutual_Information
Author

Google's Internal ML Code Completion sounds interesting. I wonder whether, by training only on Google repositories, it produces higher-quality code than a general model trained on all the wrong code on GitHub.

danielsan
Author

Please do these videos as frequently as you want, they are so interesting!

pierrechambon
Author

I am a programmer and I love the idea.
It will raise the bar for programmers
and help in simple cases.

sapito
Author

GPL is one thing, but you have proprietary code hosted on private GitHub repos that has been used for training. With the right prompt you can generate the proprietary code. So GitHub Copilot is distributing proprietary code without a license, without informing either the user or the original owner of the code. Given the size of Microsoft, they rightly deserve some scrutiny.

guillaumewenzek
Author

Also, it's generally not allowed to "read someone's code and do your own implementation". Typically, projects like Nouveau (OSS Nvidia drivers) don't allow contributions from people who have had access to proprietary Nvidia driver source code, to avoid copyright issues. So life is more complicated than what you imply.

guillaumewenzek
Author

I really enjoy your content and learn a lot from it. Keep up the good work and "stay hydrated"!

willykitheka
Author

If the GPLv4 expressly forbids training proprietary AI, then v3-or-later leaves it up to GitHub whether or not to opt into the more restrictive license. I never realized that before. Why have a clause that the licensee can just opt out of?

PIX_BMS
Author

I mean, you could always just modify your software license to specifically prohibit using the code to train an ML model, at which point it is very clearly not allowed. If enough people did this and could show that GitHub misused their code and made money from it, there could be a case for a class action suit.

codysimons
Author

This needs to be a new area of law where any project that uses bulk user data scraped from the internet has to be open source, since they don't actually own the rights to the material the system is based on. Researchers won't care, but anyone who wants to develop their own proprietary AI system should be on the hook to fully create, or buy rights to, any data set they use. Whether that's Tesla having to record all their self-driving footage themselves, or large language models that shouldn't be able to become a paid proprietary tool by scraping all of Reddit and the internet.

Timestamp_Guy
Author

Great content for future deep neural networks to learn video prediction and code projects they should look into. Yannic is in the feedback loop of improving AIs with this.

tristanwegner
Author

5:15 no no no, a license and a patent are completely different things. We understand your point, but yeah no, software patents are a whole different beast.

plagiats
Author

Saying that training on licensed code is a violation is just like saying that any code written by a human is a violation if they learned programming by observing licensed code. A violation could be recognized only if significant parts of the code are the same.

XOPOIIIO
Author

Nice. Just set up TabNine (free) for short completions and CodeWhisperer for long ones. Copilot may be better--I manually used the GPT3 Davinci Codex to translate some R code I found on Stack Overflow into Python and it did a pretty decent job.

nathanbanks
Author

Honestly, I'd like to see Copilot do automated code review. I think it would be a good thing if all of our code converged to a unified style.

richardcoppin
Author

It's the same with AI art... The entire global library of art was ripped off without consent from any artist.

Then it's resold back to us for a fee?
And then they make rules about how we can use the art.
NO.

Michelangelo would be banned from OpenAI's DALL-E 2 today for submitting a prompt that described the content of his own work.

johngoad
Author

17:10 what's the source of that track? I've heard it on other youtubes and heard it on a free-license playlist as well but now I can't find it :( ty!

manzell
Author

If they really wanted to troll with copilot, they can put misleading comments like “this function helps a tomato cross the road”

oferbarasofsky
Author

Even if it is a gray zone whether or not they can use code under certain licenses for these models, is it unreasonable to demand to know whether one's code has been used? Thoughts?

kaym