Introducing The New Champion of Function Calling!

In this video I go through the new open Tool Use / Function Calling model, which comes from Groq and Glaive and is based on the Llama-3 models.

🕵️ Interested in building LLM Agents? Fill out the form below

👨‍💻Github:

⏱️Time Stamps:
00:00 Groq Tool Use X (Twitter) Post
00:13 Groq Tool Use Blog
00:52 Berkeley Function Calling Leaderboard
02:36 Glaive AI
03:58 Code Time
14:27 Groq Hugging Face
Comments

There is no rest when you're in this industry. There's always some part of the tech stack being developed, always some new feature. Thanks for covering the best bits!

thetagang

Wow! How exciting! Man, you're my hero, Sam. You are literally 8 steps ahead of the curve.

SirajFlorida

I wish they would release a mixture-of-agents option for people to use natively through their API. I have my own setup I can use, but I see a lot of people using LLMs who don't have the ability to do that.
Function calling has great utility, but any model can do this. If you give it the tool list with definitions and the schema to use, and include in your messages array a few back-and-forth user/assistant examples that show the assistant using the tools in various scenarios, most decent models will do really well with them. In places where you're 100% sure it should be using at least one tool, you simply pair this with a function that re-asks the same question recursively until you parse the response you know you're looking for (see the sketch after this comment).

mitchellmigala
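A minimal sketch of the pattern this comment describes, assuming Groq's OpenAI-compatible Python SDK; the model name, the get_weather tool, and the few-shot messages are illustrative, not taken from the video:

```python
# Minimal sketch: few-shot tool-use examples in the messages array, plus
# a recursive re-ask when the model fails to emit a tool call.
# Model name, tool schema, and examples are illustrative.
import json
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Few-shot exchange showing the assistant actually using the tool.
messages = [
    {"role": "system", "content": "Use the provided tools when they apply."},
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "tool_calls": [{
        "id": "call_1", "type": "function",
        "function": {"name": "get_weather",
                     "arguments": json.dumps({"city": "Paris"})},
    }]},
    {"role": "tool", "tool_call_id": "call_1", "content": '{"temp_c": 18}'},
    {"role": "assistant", "content": "It's currently 18 °C in Paris."},
]

def ask_until_tool_call(question: str, max_tries: int = 3):
    """Re-ask the same question until the model emits a tool call."""
    for _ in range(max_tries):
        resp = client.chat.completions.create(
            model="llama3-groq-70b-8192-tool-use-preview",  # illustrative
            messages=messages + [{"role": "user", "content": question}],
            tools=tools,
        )
        calls = resp.choices[0].message.tool_calls
        if calls:  # got the structured response we were looking for
            return calls
    raise RuntimeError("model never produced a tool call")
```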

From my limited testing, it's significantly more prone to hallucinations than the GPT-family models I've been using: it hallucinates argument values, creates argument values out of thin air, and even invents new functions. For my use case, even gpt-3.5-turbo and the vanilla Llama 3 they're hosting do better on my custom evals than this new one, which is honestly kind of disappointing. I'm starting to feel those benchmarks are not as good a source of evaluation as they'd have us believe. (One such check is sketched below.)

jcksn
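A sketch of one check a custom eval like this can run: flag tool calls whose function name or argument names are not in the declared schema. The schema and the example call below are hypothetical:

```python
# Sketch of one eval check: flag tool calls that invent functions or
# argument names not present in the declared schema.
# The declared schema and the example call are hypothetical.
import json

DECLARED = {
    "get_weather": {"city"},  # function name -> allowed argument names
}

def audit_tool_call(call: dict) -> list[str]:
    """Return a list of hallucination findings for one tool call."""
    findings = []
    name = call["function"]["name"]
    if name not in DECLARED:
        return [f"invented function: {name}"]
    args = json.loads(call["function"]["arguments"])
    for key in args:
        if key not in DECLARED[name]:
            findings.append(f"invented argument '{key}' for {name}")
    return findings

# An argument value created out of thin air:
call = {"function": {"name": "get_weather",
                     "arguments": json.dumps({"city": "Oslo", "units": "K"})}}
print(audit_tool_call(call))  # ["invented argument 'units' for get_weather"]
```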

I don't think they'll release the dataset, as Groq wants to keep it as a competitive advantage to grow their developer base. Anyway, you mentioned query rewriting, so let me share something from actual production experience: it's too bold to ship software with function calling but without query rewriting. Recently, in a project where we needed function calling and tried many models, we faced unpredictability. Instead of fine-tuning those models, we fine-tuned GPT-2 specifically for query rewriting, using synthetic data tailored to our case. And voila! Once we implemented that, all the nuances and unpredictability were gone. Query rewriting, whether done with a strong model or with our approach, lets you use many function-calling language models effectively without fine-tuning the entire model. As in your last example, with or without the keyword "search", query rewriting is definitely one of the best steps to have in the pipeline (sketched after this comment).

unclecode
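A minimal sketch of that pipeline step, here using a strong general model for the rewrite instead of the commenter's fine-tuned GPT-2; the prompt and model name are illustrative:

```python
# Minimal sketch of a query-rewriting step placed in front of the
# function-calling model. A strong general model does the rewrite here
# instead of a fine-tuned GPT-2; prompt and model name are illustrative.
from groq import Groq

client = Groq()

REWRITE_PROMPT = (
    "Rewrite the user's message as one explicit, self-contained request, "
    "naming the intended action and its parameters. Output only the rewrite."
)

def rewrite_query(user_message: str) -> str:
    resp = client.chat.completions.create(
        model="llama3-70b-8192",  # illustrative rewriter model
        messages=[
            {"role": "system", "content": REWRITE_PROMPT},
            {"role": "user", "content": user_message},
        ],
        temperature=0,
    )
    return resp.choices[0].message.content

# "when do the olympics start" might become something like
# "Search the web for the start date of the next Olympic Games",
# which is then passed to the tool-use model.
```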

Thanks for the video, an interesting model. Am I right in thinking that this model is good at extracting data from text to build properly formatted inputs for tool calls, but weaker at deciding whether to call a tool at all? As you showed with your "(search) when do the olympics start" example, I was a bit surprised that a 70B model couldn't get that one. I see they also mention this in their blog post: a hybrid/routing approach. It would be interesting to see the benchmarks/performance if the models were allowed such a "reasoning layer" on top (a sketch of the idea follows this comment).

ringpolitiet
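A sketch of what such a routing layer could look like: a cheap first pass decides whether the query needs a tool at all, then routes to either the tool-use model or a plain chat model. Prompts and model names are illustrative:

```python
# Sketch of a hybrid/routing layer: a cheap first pass classifies
# whether the query needs a tool, then routes to either the tool-use
# model or a plain chat model. Prompts and model names are illustrative.
from groq import Groq

client = Groq()

ROUTER_PROMPT = (
    "Answer strictly 'yes' or 'no': does this query require calling an "
    "external tool (search, weather, calculator, ...)?"
)

def needs_tool(query: str) -> bool:
    resp = client.chat.completions.create(
        model="llama3-8b-8192",  # small, cheap router model (illustrative)
        messages=[
            {"role": "system", "content": ROUTER_PROMPT},
            {"role": "user", "content": query},
        ],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().lower().startswith("yes")

def pick_model(query: str) -> str:
    return ("llama3-groq-70b-8192-tool-use-preview" if needs_tool(query)
            else "llama3-70b-8192")

print(pick_model("when do the olympics start"))  # should pick tool use
```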

In my local testing, Llama 3 8B already seems pretty good at function calling (I couldn't find cases where it fails).

It would be interesting to see in which function-calling cases these high-performing FC models succeed while Meta's Llama 3 fails.

tpadilha

We can still fine-tune it further, right?
Would that make a difference?

sanchaythalnerkar

I think phidata does the best open-source function calling.

teddyfulk

I really don't understand why we need this. Can't you just send a prompt to the LLM: "calculate this formula and return the result in JSON format:
[ {
"formula": "",
"result": ""
} ]"
Why do we complicate things with a lot of extra text that is 100% guaranteed to have a typo somewhere, one you'll spend hours finding, to achieve what exactly? (The plain-prompt approach is sketched after this comment.)

hqcart
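The plain-prompt approach this comment describes does work for simple cases; below is a sketch of it, including the parse-and-retry loop you usually end up writing anyway, which is much of what a tool-calling API formalizes. The model name and the formula are illustrative:

```python
# Sketch of the plain-prompt approach: ask for JSON, then parse, with
# the validation/retry loop you usually end up writing anyway -- which
# is roughly what a tool-calling API formalizes. Model name and the
# formula are illustrative.
import json
from groq import Groq

client = Groq()

PROMPT = """Calculate this formula and return the result as JSON only:
[ {"formula": "", "result": ""} ]
Formula: 12 * (3 + 4)"""

def ask_for_json(prompt: str, max_tries: int = 3):
    for _ in range(max_tries):
        resp = client.chat.completions.create(
            model="llama3-70b-8192",  # illustrative
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        text = resp.choices[0].message.content
        try:
            return json.loads(text)  # fails on chatty or malformed output
        except json.JSONDecodeError:
            continue  # re-ask; this fragility is what function calling avoids
    raise ValueError("no parseable JSON after retries")

print(ask_for_json(PROMPT))
```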

This model is trash, I'm sorry, but whoever did the benchmarking needs to be fired. It fails every 3-4 calls, quite regularly. It's OK for super, super simple function calls, and it's no better than the base Llama 3 model. Thumbs down on this model from me.

davidrobertson