Extending LLMs - RAG Demo on the Groq® LPU™ Inference Engine

Getting even more wow out of your open-source LLMs!
Retrieval-Augmented Generation (RAG) is an increasingly common approach to addressing some of the limitations of LLMs. If you're looking to leverage proprietary, organization-specific information in your LLM implementation, this is definitely a technique to be familiar with! In this five-minute demo we show how RAG can alleviate some of those limitations and see it working in action.
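The RAG pattern the demo walks through can be sketched in a few lines. This is an illustrative toy, not the demo's actual code: real systems use an embedding model plus a vector database for retrieval and then call an LLM API (such as Groq's) with the augmented prompt; here, retrieval is faked with simple word-overlap scoring, and the document list and prompt wording are made up for the example.

```python
def score(query: str, doc: str) -> float:
    """Crude relevance score: fraction of query words that appear in the document.
    A real RAG system would compare embedding vectors instead."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def retrieve(query: str, docs: list[str]) -> str:
    """Step 1: retrieve the document most relevant to the query."""
    return max(docs, key=lambda doc: score(query, doc))

def build_prompt(query: str, docs: list[str]) -> str:
    """Step 2: augment the user's question with the retrieved context,
    so the LLM answers from organization-specific data rather than
    only its training set."""
    context = retrieve(query, docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical proprietary documents the base model has never seen.
docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am to 5pm Pacific, Monday through Friday.",
]

# Step 3 (not shown): send this prompt to the LLM for generation.
prompt = build_prompt("What is the refund policy?", docs)
```

The key idea is that only the retrieval and prompt-assembly steps change between providers; the generation step is an ordinary chat-completion call.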

#artificialintelligence #machinelearning #demo #ai #llm #llama #genai #groq #lesson
Comments

Check out #GroqSpeed by going to our website, www.groq.com, where you can experience the world's fastest LLM inference on open-source foundational models. You can also apply for API access if you're a developer.

GroqInc

Could you make a step-by-step tutorial on how to build a RAG pipeline using Pinecone and Groq? Very interesting tutorial, by the way.

Machiuka

Please make a step-by-step tutorial on this. I have been testing Groq chat and am amazed by the speed!! I'd like to include this tech in my MS project.

gazzalifahim

Hi! Could we have more information about the time needed to retrieve the right document (this retrieval phase runs on your LPU, right?) and how long it takes the LLM to generate the answer? Thanks in advance, I love your work!

PierreDelesse

As far as I've understood, custom GPTs use RAG; am I wrong? Maybe they are not super flexible, but GPTs already use RAG and compression when you upload information to the custom knowledge base: they pull info from a vector store, a web crawler, an interface for integrating API tools, and so on...

disruptiveS_

Hi, will the Groq API support function calling, especially parallel function calls?

BIGBICAI

What's the energy cost in joules per token? If it's using 14 nm instead of 3 nm chips, it's hard to see how it's feasible, since the energy use at 3 nm is already really high.

heelspurs

How does RAG compare to traditional knowledge bases? The way I understand it, KBs are static while RAG is dynamic? If so, how does this actually work under the hood (APIs?)? Very interesting, but too many questions left unanswered.

awakenwithoutcoffee

I want to know if you'll host fine-tuned models in the future.

JameelaZain-qiwk

The problem is: I need RAG and function calling at the same time, and I would love to use Groq because of its speed. Is that possible?

gerdberlin

How do we get access to the API? I can't afford the hardware, but I'd like to pay to use the API and I can't figure out where to sign up.

dad

Is RAG basically an alternative to an infinite context window?

yorgohoebeke

You lost me when you said BBC is a trusted source

usernamesSux