5 Problems Getting LLM Agents into Production


In this video I discuss 5 common problems in building LLM Agents for production

🕵️ Interested in building LLM Agents? Fill out the form below

👨‍💻Github:

⏱️Time Stamps:
00:00 Intro
00:58 Reliability
02:46 Excessive Loops
04:36 Tools
07:59 Self-checking
09:22 Lack of Explainability
10:09 Bonus: Debugging an Agent


Comments

crewAI creator here 👋
Thanks, great feedback! We are addressing a lot of that internally :)
Great video!

BuildNewThings

This is hands down the best channel about AI. No hype-building or irrelevant tests. Just pure practice. Thank you!

MrKrzysiek

LLMs are nice for subjective stuff like creative tasks (which is the last thing we want to use them for) and for having fun with things like chatbots. For real-world applications that require concrete results and some overall robustness, LLMs are like a time bomb. We are not there yet. There are people who trust that LLMs can be part of a robust application, and that trust is what will cause the most issues with so-called AI in the long term. Skynet won't happen, stupidity will kill us first.

squiddymute

Can't wait to see the next videos about building LLM Agents with LangGraph

tovanhai

Very honest and to the point about the actual status of multi-agent AI to date.

mrka_

This completely reflects my experience, and I appreciate your summary a lot. Thank you for all the effort!

I started with CrewAI first and very quickly ended up spending 10 USD in a single day on API calls. Then I realized I cannot really debug it. I also missed a transparent, standardized AI DevOps setup, incl. tests.

I'm currently experimenting with LangGraph, but I've started wondering why to use a framework at all. I'm not ready to decide yet.

Anyway, thanks again, looking forward to following your journey 😊🤗

MartinBlaha

Thanks for the video. It just confirms my thoughts after fiddling with LangChain, Agents and Tools for the past year: the best thing to come out has been LangGraph. Loved your video and you have a sub.

bpraghu

OMG, your point is spot on! That's exactly the problem I had to deal with in my project

silvias

Amazing, thanks Sam! Huge fan, really loving and devouring your content. My first reaction was "ok good, it's not just me" 😅. But to your point, I think fine-grained agentic decisions with formatted/constrained outputs are key to mitigating some of those problems 💪. Even the idea (with the right models and speed/costs allowing) of having a few agents work on the same task (maybe with slight prompt variations?) and then using consensus/voting on the constrained output before passing it to the next agent is a decent approach.

jamesyoungerdds
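
The consensus idea in the comment above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the approach from the video: the label set `ALLOWED` and the function `majority_vote` are hypothetical names, and the agent answers are assumed to have already been collected as plain strings from whatever framework produces them.

```python
from collections import Counter
from typing import Optional, Sequence

# Hypothetical constrained label set: each agent must answer with one of these.
ALLOWED = frozenset({"approve", "reject", "escalate"})


def majority_vote(
    answers: Sequence[str],
    allowed: frozenset = ALLOWED,
    min_share: float = 0.5,
) -> Optional[str]:
    """Return the label chosen by more than `min_share` of all agents.

    Answers outside the allowed set are discarded, but they still count
    in the denominator, so malformed outputs make consensus harder to
    reach rather than easier. Returns None when there is no consensus.
    """
    # Normalize whitespace and case, then keep only valid labels.
    valid = [a.strip().lower() for a in answers]
    valid = [a for a in valid if a in allowed]
    if not valid:
        return None
    label, count = Counter(valid).most_common(1)[0]
    return label if count / len(answers) > min_share else None


if __name__ == "__main__":
    # Three hypothetical agent outputs for the same task.
    print(majority_vote(["approve", "Approve ", "reject"]))
```

Returning `None` instead of forcing a winner leaves the caller free to retry, escalate to a human, or fall back to a fixed rule, which fits the "decision point" framing better than always passing something on to the next agent.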

I really like your channel, but recently actually *watching* it has become a lot less enjoyable with all the nonsensical stock videos in the background. I would much rather have the classic talking head or even just a still frame than those stock videos that don't add anything to the actual content.

hawa

Sam, great video - I totally agree with the content. In my business (I work for a pharmaceutical company), the minimal reliability threshold is 95%. At the moment, I use agents as part of the experimentation process with AI, not for production. I think in a few months with GPT-5, they will work better and be more useful.

micbab-vgmu

I would suggest introducing critics and SOPs into the agent operations. What do you think?

waneyvin

We have just implemented this kind of service. We needed 6 tools and two agents. Not an easy job. Lots of time was spent writing logic for errors.

midnightmoves

I'm sure that one day it will simply work to write background stories and have a team that does whatever is needed. But currently, all I can do is have my own functions that produce single-word outputs, and use those to decide how to handle the current user prompt.

As with dialects, models cannot all be prompted the same way. A Prompt delivers on models p * (n + m )^( n + m)-1 results. And this is just the input.

It is tedious, but a start.

And I agree with the decision point thing.

MeinDeutschkurs

Replacing your employee with an LLM is like hiring a virtuoso teenager on LSD.

shApYT

Indeed, you are pointing out a real issue. And with your permission, I'd like to add one more. The tech ecosystem is currently hyped around large language models (LLMs), causing people to forget that many tasks used to be handled by smaller, task-specific models. The push from VCs and the hype around LLMs lead to the misconception that LLMs are needed for everything, creating immature products and numerous issues.

Most problems that people try to solve with LLMs and so-called agents could often be addressed with small models or even without machine learning. A better approach is to integrate LLMs as part of a larger system, allowing for incremental improvements as new research emerges, rather than building entire systems based on LLMs due to hype.

So, to add to your five problems, the sixth is: Don't build agents out of hype, and don't design entire systems on LLMs.

unclecode

Really insightful video, Sam! Glad to find your channel. Could you share your opinion on Haystack? I haven't seen a comparison between Haystack and LangGraph for real product development. Which would you recommend for creating scalable and complex LLM agents with more than 30 nodes?

khaledsaud

"Alright"? What happened to the iconic "Okay" intro? I even used it as a tribute to you in my demo video.

runmicteam

Really useful video, thank you. Is that website diffing tool open source somewhere? You reminded me I need something similar.

ChrisDermody

In the not-so-distant past there were bots called RPA bots, very powerful for repetitive processes based on predefined steps. They excelled at reliability in a stable context but lacked reasoning capabilities and were incapable of any kind of self-reflection. The dream was to have cognitive capabilities one day in the future, maybe.
One day generative AI with LLMs and reasoning emerged. Suddenly the automation industry started to think about how to automate work with it, and RPA bots were forgotten. But agents built on LLMs started showing their limits in real business: they hallucinate, and they sometimes reflect wrongly or too much where a straightforward action is enough. Then someone recalled that straightforward actions are exactly where RPA bots shine.
The roles of GenAI LLM-based agents and RPA started to become clear.
Orchestration with reasoning capabilities looks like the place for agents.
Actions and tooling are left to RPA.
It's the dawn of a new era of possibilities, in which a smart combination of GenAI orchestrator agents and RPA bots could create the best possible business process automation solutions for corporations.

sergeziehi
