What is an LLM Router?

In this video I take a look at RouteLLM, a new open-source framework and accompanying paper from LMSYS that helps you automate LLM selection based on the input query (a quick usage sketch follows the timestamps).

🕵️ Interested in building LLM Agents? Fill out the form below

👨‍💻Github:

⏱️Time Stamps:
00:00 Intro
01:15 LMSYS RouteLLM Blog
03:24 RouteLLM Paper
03:46 RouteLLM Github
08:31 RouteLLM Hugging Face
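
As a quick illustration, using the framework looks roughly like this (a sketch based on the RouteLLM repo's examples; the router choice, cost threshold, and model ids are illustrative, not a definitive setup):

```python
import os
from routellm.controller import Controller

os.environ["OPENAI_API_KEY"] = "sk-..."  # key for the strong-model provider

# The controller exposes an OpenAI-compatible interface; each request is
# sent to the strong or weak model based on the predicted query difficulty.
client = Controller(
    routers=["mf"],  # the matrix-factorisation router from the paper
    strong_model="gpt-4-1106-preview",
    weak_model="mistralai/Mixtral-8x7B-Instruct-v0.1",
)

response = client.chat.completions.create(
    # The number in the model name is the routing threshold, which
    # controls the cost/quality trade-off.
    model="router-mf-0.11593",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
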
Comments

Great open-source release, thanks for the video.

RolandoLopezNieto

Great work and excellent explanation. Thank you

bayesian

Haha.

Last night I was chatting with someone about how to come up with data to solve this problem. I have a story-writing engine that can blow through $10 of tokens in minutes. It is getting really expensive just to develop it.

This morning I was going to look around to see if anybody had something like this.

And the solution to my quest is in the first video I watched this morning.

I hope the rest of my day is this awesome.

JohnBoen

Wow, it's great that this was released with the entire framework open source, as I believe that this (or something like it) will be part of the interface we will all be using soon. The other component is determining what data is required to respond. For instance, does the query require proprietary or personal data? This would first create a context (through RAG) for that data, but also determine which LLMs would be available to that context based on the required security (do you even want to send the proprietary data to a commercial LLM?). Also, with Llama 3 8B, this could be done locally (at almost no cost). BTW, this is part of the framework that Apple will be implementing, but it can be tailored for many other applications now using this framework and LangChain (for instance).

toadlguy
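
A minimal sketch of the security-gating idea in this comment; the detection patterns, helper names, and model ids are all hypothetical, and a real system would use a proper PII/classification step rather than regexes:

```python
import re

# Hypothetical patterns standing in for a real sensitive-data classifier.
SENSITIVE_PATTERNS = [
    r"\b\d{3}-\d{2}-\d{4}\b",    # US-SSN-like numbers
    r"(?i)\bconfidential\b",     # documents marked confidential
]

def contains_sensitive_data(query: str) -> bool:
    """Crude check for proprietary or personal data in the query."""
    return any(re.search(p, query) for p in SENSITIVE_PATTERNS)

def pick_model(query: str) -> str:
    # Sensitive data never leaves the machine: route to a local model.
    if contains_sensitive_data(query):
        return "local/llama-3-8b-instruct"  # illustrative local model id
    # Otherwise a commercial API is allowed.
    return "gpt-4o"

print(pick_model("Summarise this confidential earnings report"))
# -> local/llama-3-8b-instruct
```
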

Would be interesting to see how this and MoA (Mixture of Agents) could be used together.
Perhaps the route could go to a mixture that uses several smaller agents (models) together, medium agents together, or larger agents together, and/or larger agents mixed with smaller ones.

jeffg

Great one, Sam. So, to make this all about me :-), I've been using GPT4x as the router/manager, under the theory that it is the smartest (this is a Mixture of Agents); then the agents are cheaper. I can see this is much better. Thanks!

KevinKreger

Good one. It will really help enterprises save costs.

SanjaySingh-gjkq

I'd like to see more examples of applications of LLMs.

thirdreplicator

This might actually work really well in test scenarios, i.e. which LLM provides the best accuracy vs. speed compromise, for example in RAG / knowledge-graph systems.

themaxgo

Worth comparing how well it performs vs. the semantic-router lib, which is also free to use.

wumenfd

It's a good idea, but not practical in real-life apps.
I use multiple LLMs for my app, and I manually test them first to make sure the weaker models are suitable for my tasks; then I route each of the different tasks to a different LLM based on intensive test results.
I am unsure how or where this router AI would be useful.

hqcart
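
A minimal sketch of the hand-tuned, task-based routing this commenter describes; the task names, model ids, and which model "passed testing" are illustrative assumptions:

```python
# Each task type maps to the cheapest model that passed offline testing
# for that task; routing is a simple lookup, not a learned classifier.
TASK_TO_MODEL = {
    "summarisation":    "gpt-3.5-turbo",  # weak model tested as good enough
    "data_extraction":  "gpt-3.5-turbo",
    "code_generation":  "gpt-4o",         # only the strong model passed
    "creative_writing": "gpt-4o",
}

DEFAULT_MODEL = "gpt-4o"  # fall back to the strong model for unknown tasks

def route(task: str) -> str:
    """Return the model pre-selected for this task type."""
    return TASK_TO_MODEL.get(task, DEFAULT_MODEL)

print(route("summarisation"))  # -> gpt-3.5-turbo
print(route("legal_review"))   # -> gpt-4o (default)
```
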

This is actually very interesting. Concretely, when you use LangChain and have statically linked LLMs in some custom tools, how could we redirect this directly from LangChain so that the routing happens afterwards?

AdrienSales

Claude 3.5 Haiku with this framework is gonna be insane. Nice video as always!

lydedreamoz

I don't know whether this data-oriented way of evaluating where to direct a query is going to be better than a task-based one. For my app, it would be far easier to route summarisation tasks vs. data extraction tasks vs. other tasks.

nickludlam

Can we have a code example of this using LangChain, since it's the most common framework people use for LLMs, please?

yazanrisheh

Can this pair a local model with a cloud LLM and be even cheaper? Would love to see this with the new generations of Phi and Gemma.

VastCNC

Haha, how is this new? I started doing this about 20 mins after trying GPT-4. I appreciate the formal framework and the improvements they've made, though. That said, I use GPT-3.5 to filter first, and yeah, it saved me a ton of money. Not only is it the cheaper model, it also uses simpler (short) prompts, like "respond IGNORE if this message is not asking for a response." Then I only send the messages that need responses to GPT-4 with a full prompt. Use a simple model first. Save tokens. Save 20-50x on LLM costs (my use case). Profit.

Also worth noting that ChatGPT has something similar. People have long known that some responses get routed to GPT-3.5 vs 4.0+.

jarail
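
A minimal sketch of the cheap-model-first filter described in this comment, using the OpenAI Python client; the IGNORE prompt follows the comment, while the helper names and exact escalation logic are an illustrative reading of the approach:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

FILTER_PROMPT = "Respond IGNORE if this message is not asking for a response."

def needs_response(message: str) -> bool:
    """Ask the cheap model whether the message needs a real answer."""
    result = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": FILTER_PROMPT},
            {"role": "user", "content": message},
        ],
    )
    return "IGNORE" not in result.choices[0].message.content

def answer(message: str) -> str | None:
    # Only escalate to the expensive model when the filter says so.
    if not needs_response(message):
        return None
    result = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": message}],
    )
    return result.choices[0].message.content
```
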

What about the latency impact? Wouldn't this preclude a lot of production use cases?

amj-

What about "function calling"? Can you really move between models?

not_a_human_being

Flash is a better and cheaper reranker than the rerankers on the market (including Cohere).

bastabey