ChatGPT-O1 Changes Programming as a Profession. I really hated saying that.

AIs have not been able to do as well as professional programmers up until now. For a small subset of entry-level programmer jobs, that's no longer true (at least until the new hire becomes more experienced).

It still doesn't have the judgement a programmer needs (or anything close to it), and it can't do the communication parts of the job (and is still awful at writing tests). But it's starting to live up to the capabilities of this form of Machine Learning.

If your only programming experience is homework assignments, or if your programming workflow primarily consists of copying and pasting from Stack Overflow, then I've got some bad news for you.

00:00 ChatGPT-O1 is different
02:00 What I've been testing
03:03 Coding RegExes in Rust
05:19 Incorrect Assumptions and Expectations
06:22 Rechecking ChatGPT-4o. Still bad
06:59 The specific case where it might replace a programmer job
07:11 But, it's a stupid job that shouldn't exist
08:03 Many C.S. programs don't prepare for useful skills
09:24 O1 Still has a lot of limitations
10:48 Just as bad at tests as Claude 3.5
11:16 Still bad at picking between alternatives
11:51 O1 couldn't even make this compile
13:40 "Human-like reasoning" claim is nonsense
14:22 Don't panic - tools obsolete skills every few years
15:23 But, things are going to be bumpy
15:35 Advantage to smaller, nimble companies for a while
16:01 Still going to be plenty of programmer jobs
17:35 But you need to learn how to use these tools, and well

References:

CodeCrafters (affiliate link) if you want to see how you measure up to 4o:

That nightmare regular expression:

My video with Proof News on the AI judgement problem, and Claude 3.5's poor test writing:

My video comparing 4 AI tools using CodeCrafters' HTTP in Python challenge:

My video on Agile paving the way for AI to hurt programmers:

My video on Software Testing in an AI world:
Comments

this is the kind of video that makes me drop everything and watch, instead of saving it to the "watch later" playlist :D

n-o-i-d

wait the "LLMs suck" guy says "o1 is good"? Oh man, I have to watch this. :)

CherryBlossomStorm

The only code i trust ai for is code i could write myself but save an hour actually coding it by spending 5 minutes detailing what exactly i want. Give it no room for interpretation. It is horrendous when it comes to any actual problem solving

graydhd

"We've reached the point where every programmer needs to know how to use AI tools". so brutal, times are changing. it's happening

randomamplifierJunior

I mean, who wants to do the boring stuff of solving problems and connecting ideas together? What software developers really want to spend their time on is cleaning up and debugging code they have not written. So awesome that AI is freeing us up to focus more on that stuff for the rest of our careers!

sharkysharkerson

We hire entry-level people not for what they can do, but for what they can learn to do. It is the potential that we seek.

reinventing_the_circle

I'm not a developer but I've been a manager in big tech. You don't need to replace 100% of your workforce with machines/ai, what you need is to replace 70/80% of it so that the human 20% can supervise them.
Devs won't disappear but like blue collar jobs they will become less important the more the tech progresses

manueldl

Boy that title got me in here right quick! I really appreciate the time you put in benchmarking these LLMs. You're a voice I actually trust to see beyond the hype.

An interesting time to be alive for sure. It has me wondering how much more juice we can still squeeze out of transformer based models.

I guess at this point the race is on to create a system that allows you to feed the right context in a codebase to the LLM so it can actually work off a senior dev's instructions faster than it would take that dev to write the instructions.

At that point there's definitely a business case for more myopic projects/companies.

In the meantime, let's rejoice in the fact that leetcode interviews are dead at last!

matthiasroshardt

This is such a weird time. Some people are saying it's all hype and there's nothing to worry about while others are saying dev jobs, at least new dev jobs, will basically be obsolete in a couple of years. Obviously nobody really knows but man it feels like a gamble to go down this career path right now.

chalkfarm

As a data scientist, coding is not my top priority, but my coding learning curve has increased dramatically due to AI. No more choosing the right google link to hunt for an answer. Bam! Answer and good explanation of a chunk in 2 seconds. Still need the knowledge to know when something is wrong, but it's pretty accurate for learning basic/intermediate skills and explaining concepts. I'm learning much, much, faster.

vlnzt

I was blown away when I tried cursor/o1 at first. It was so awesome, I was literally programming entirely through prompts and it felt pretty slick.

Then…the code base started to grow. By the time that I had several thousand lines of python, I noticed a significant drop off in quality. I think context is still LLM’s Achilles heel.

farmtech-vnew

Translation: ChatGPT-O1 gets good at artificial jobs created to hold on for a young talent pool until they mature somewhat and start to become actually useful... pretty much nothing else changes. So big companies will be tempted to "replace" people and stop holding young talent pool, which will dissipate and (hopefully) join smaller companies which actually need you to DO stuff. It's going to be harder to be hired for your LeetCode skills.

darkarchon

This is a bummer. Even if jobs are still safe in the future(I have my doubts), I really don't like the workflow of begging/coercing the blackbox into producing the code for me. Wish I was born half a century earlier and was ready to bow out now. In the past I thought it'd be exciting to see the rise of robots, now a solved world just seems sad, bland, and boring.

iverbrnstad

As a developer of 5+ years, I don't understand how the people who make these kinds of videos use AI to write code... I would really love to see the prompts and the process that led you to those conclusions.
On a side note, I remember the first iteration of ChatGPT 4, which was very slow but wrote really good code. Then with every new update it got really bad, especially ChatGPT 4o.
I would love to see a video of how you would build great software, then have an LLM do the same and compare the differences. Most importantly, use a real use case, not a shitty todo app, a snake game, or some other largely useless application with tons of documentation.

gazorbpazorbian

It is interesting that ChatGPT-o1 made exactly the same mistake as Claude-3.5 when coding the test case. They must both have the same flawed example of testing whether a circle had been drawn in their training data. Even more disturbing is that the o1 chain of thought + reinforcement learning didn't suppress the weighting of that training example. It implies that the CoT+RL approach is only reinforcing paths that arrive at source code that produces the result specified in the prompt, but not paths that arrive at source code robust to adversarial cases implied by but not specifically mentioned in the prompt. As you said, it lacks the initiative to dream up all the different ways a bug might fool its flawed test code into returning a false pass.

bornach

I don't fear that AI will replace juniors, but I think it will make them less needed than before. And as a self-taught programmer, I'm extra screwed, because if a company has the option to hire someone with a degree vs. someone without one, I think the choice is obvious.

mojtabaazeez

Also reasoning in math/physics tasks is getting spooky. I teach mechanical engineering in a top 10 ranked uni and I have a few physics questions that use "basic mechanics", are not complex to formulate but most PhD students in mechanics took a long time to figure them out. I used them to bench LLMs and all models so far, 4o but also sonnet 3.5 answered them with absolute garbage answers, far worse than any first attempt by any phd student ive seen. O1 just thought for a minute and then aced it and even provided the most elegant explanation I could think of myself. I am not sure what to think. This must have "an end"? I can't believe it will improve that much more anymore for a while now...

amarug

Much respect to you changing your mind brother. It takes guts to actually take in new information and change your viewpoint. A lot of people will not even admit they were somewhat wrong

arnavprakash

Juniors are not hired to solve LeetCode problems. They are expected to grow up in the field and become seniors. Actual seniors will retire at some point.

bestopinion

o1 probably has publicly available solutions from codecrafters at this point

yzhishko