An honest review of Devin AI

It's been 24 hours since I got access to Devin, the world's first fully autonomous software engineer. In this video, I wanted to give a detailed review of Devin's strengths and weaknesses. Enjoy!

By the way, if you want to try out one of the apps that Devin built for me, here it is:

Comments
Author

Chapters (Powered by ChapterMe) -
00:00 - Devin AI agent that claims to be world's first fully autonomous
02:38 - Devin's app turns world into museum
03:30 - Devin app builder with questions, planning, updates
05:20 - Android app asks users for help
06:08 - Deployment time 134
08:29 - Two-hour CSS change to add features
08:51 - Devin walkthrough reveals power of commands
09:25 - Devin: excellent prototyping agent, impeccable UX
12:49 - AI software engineer Devin's slow performance
14:59 - Devin's powerful features, sign up now
15:22 - Lifted access for players

danecjensen
Author

I just received my invite to Devin. The cheapest plan offered: "Personal (Devin Lite) users receive early access to Devin Lite for $50 / month, which includes 65 Devin Lite ACU / month built in. Additional Devin Lite ACUs can be purchased at our standard unit rate of $0.8 / ACU. Currently, each ACU is approximately equivalent to 10 minutes of active Devin Lite work." I'm having difficulty finding more information, but it seems to me that for $50 I get 650 minutes of compute. Given the task times Zack reported, this seems like a very poor offer.
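
The arithmetic in the comment above can be checked with a short sketch. The price, ACU count, and minutes-per-ACU figures are the ones quoted in the comment; the function names and the rule that overage is billed in whole ACUs are assumptions made for illustration, not something the plan text states.

```python
import math

# Figures quoted in the comment (Devin Lite "Personal" plan).
PLAN_PRICE_USD = 50.0
INCLUDED_ACUS = 65
EXTRA_ACU_PRICE_USD = 0.8
MINUTES_PER_ACU = 10  # "approximately", per the quoted plan text


def included_minutes() -> int:
    """Active minutes covered by the base plan."""
    return INCLUDED_ACUS * MINUTES_PER_ACU


def monthly_cost(active_minutes: float) -> float:
    """Hypothetical helper: base price plus overage.

    Assumes usage is rounded up to whole ACUs, which the
    quoted plan text does not confirm.
    """
    needed_acus = math.ceil(active_minutes / MINUTES_PER_ACU)
    extra_acus = max(0, needed_acus - INCLUDED_ACUS)
    return PLAN_PRICE_USD + extra_acus * EXTRA_ACU_PRICE_USD


if __name__ == "__main__":
    print(included_minutes())          # minutes included in the plan
    print(included_minutes() / 60)     # the same figure in hours
    print(monthly_cost(2 * 650))       # cost at double the included usage
```

Under these assumptions the plan covers 650 minutes, a little under 11 hours of active work per month, which matches the commenter's estimate.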

indigo
Author

Are you sure it's not like the Amazon AI, with a bunch of real people behind the scenes? Hahaha, it seems too slow for an AI.

o__sama
Author

Awesome to see you're back, love the overview

ryanlisse
Author

I would love to see a video of Devin working in an existing project.

arvinddhindsa
Author

I'd love to see it contribute to an existing codebase. Maybe try it on projects of different sizes and complexities.

youssefwalid
Author

Did you review the code to look for flaws in its functions' logic? Did you extensively test the results it produced to make sure it was not bugged? I have yet to see an LLM produce code that isn't flawed, either by being stupidly overengineered, really poorly structured in terms of performance (like for loops meant for huge lists with DOM calls inside handlers like dragover, not using any cached data), or simply not covering all use cases. I also see LLMs typically get stuck on a "solution" they think is correct, even if you tell them to start from scratch, and your only option is to start a new chat instance to make them change their approach.
LLMs also seem to lack the ability to grasp simple logic that is observable to us humans. Ask one to figure out the next number in a sequence and sometimes it will get it right, but once the sequence gets more complicated, it will go completely off track and not understand the basic observable logic. This is obviously why ChatGPT and so on usually produce such flawed code.
How did Devin perform in terms of actual good code? By good I mean stable, bug-free and performance-focused. I couldn't care less about readable code if AI is writing it and able to parse it. Speaking of which, can it parse complex code and understand how it works? In my experience, LLMs can usually grasp the overall purpose of the code they are asked to analyze, but won't understand (again) some of the logical reasons for certain parts of the code. Unless Devin can produce good code, I see it as no better than any other LLM option.

nustaniel
Author

Thank you, honestly wild how this is the only review of Devin, the rest are just predictions lol.

oo--
Author

fingers crossed that I get in and try it out.

TenseiCho
Author

Hasn't aged well?

So far nobody has really been able to show it doing anything.

EddyLeeKhane
Author

Btw, I still haven't had a chance to use it. Could you tell me how you got access?

lilian-ud
Author

So maybe it felt different using it, but it looked pretty horrible. I think most devs with Copilot could do this infinitely faster. Not to mention they would remember the solution and re-implement similar projects very easily in the future.
The entire promise of Devin was a done-for-you software engineer. This looked horribly inefficient. And if it needs this much input from a dev, then why can't the dev just use Copilot to implement it himself?

wonderfulworldofmarkets
Author

Am I the only one who doesn't have access?

slavaprotv
Author

Hi, Zack! How quickly did you get off the waiting list?

МаксБоровой-фо
Author

Great overview. I really appreciated this video.

Danefrak
Author

Why does nobody who got access to Devin ever show real-time interaction with it? Probably because it would reveal how capable this ChatGPT wrapper really is.

Coder.tahsin
Author

There were 2 things I wanted to see about Devin:

How smart - which I guess is not amazing? I'm not sure how buggy the final product is, but it seemed to me like it ran into issues and required intervention.
In the end, the main thing I guess is that it can do stuff, but if it's not as smart as Claude or GPT, then I might as well just copy and paste from the smarter LLM instead of waiting for a dumber one to automatically do it for me.
Basically, if you need to solve a coding problem, it does not seem like Devin is the way to do it.

How big is the context window - not sure from this, I guess. The current problem with LLMs is that it's hard to have an entire project as context, so you have to find where to fix or add something and give them that info. I doubt Devin solved this, so I kind of want to see it given (or made to generate) a big project, at least bigger than regular LLM context windows, then told to fix something, to see how it handles that.

If those 2 things fail, then Devin is more or less a convenience thing - an AI that automatically runs what it generates, reads the error, and reprompts itself. I mean, these things have already existed for a while with AutoGPT and other stuff, so I'm not too invested, especially considering I can just run the LLM-generated code myself and give it the errors.

So basically, it seems to boil down to convenience. As you said, it could take hours and give out a terrible buggy mess, but at least you didn't spend time on it. But if you truly want an actual product, it seems using smarter LLMs is still the way to go.

cubed.public
Author

So if I wanna be a developer I'm fucked, AI can do it all for me

axelvirtus
Author

great video, the fan noise is slightly annoying though :-)

sepiaflux
Author

Davis Michael Martinez Margaret Thomas Barbara

SiyaMoni-qv