Extending LLMs - RAG Demo on the Groq® LPU™ Inference Engine

Getting even more wow out of your open-source LLMs!
Retrieval-Augmented Generation (RAG) is an increasingly common approach to addressing some of the limitations of LLMs. If you're looking to leverage proprietary, organization-specific information in your LLM implementation, this is definitely a technique to be familiar with! In this five-minute demo we show how RAG can alleviate some of those limitations and see it working in action.
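The RAG pattern the demo walks through can be sketched in a few lines. This is an illustrative toy, not the demo's actual code: real systems use an embedding model plus a vector database for retrieval and then call an LLM API (such as Groq's) with the augmented prompt; here, retrieval is faked with simple word-overlap scoring, and the document list and prompt wording are made up for the example.

```python
def score(query: str, doc: str) -> float:
    """Crude relevance score: fraction of query words that appear in the document.
    A real RAG system would compare embedding vectors instead."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def retrieve(query: str, docs: list[str]) -> str:
    """Step 1: retrieve the document most relevant to the query."""
    return max(docs, key=lambda doc: score(query, doc))

def build_prompt(query: str, docs: list[str]) -> str:
    """Step 2: augment the user's question with the retrieved context,
    so the LLM answers from organization-specific data rather than
    only its training set."""
    context = retrieve(query, docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical proprietary documents the base model has never seen.
docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am to 5pm Pacific, Monday through Friday.",
]

# Step 3 (not shown): send this prompt to the LLM for generation.
prompt = build_prompt("What is the refund policy?", docs)
```

The key idea is that only the retrieval and prompt-assembly steps change between providers; the generation step is an ordinary chat-completion call.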

#artificialintelligence #machinelearning #demo #ai #llm #llama #genai #groq #lesson
Comments

Check out #GroqSpeed by going to our website, www.groq.com, where you can experience the world's fastest LLM inference on open-source foundational models. You can also apply for API access if you're a developer.

GroqInc

Could you make a step-by-step tutorial on how to build a RAG pipeline using Pinecone and Groq? Very interesting tutorial, by the way.

Machiuka

Please make a step-by-step tutorial on this. I have been testing Groq chat and am amazed by the speed!! I'd like to include this tech in my MS project.

gazzalifahim

Hi! Could we have more information about the time needed to retrieve the right document (this retrieval phase runs on your LPU, right?) and how long it takes the LLM to generate the answer? Thanks in advance, I love your work!

PierreDelesse

As far as I've understood, custom GPTs use RAG; am I wrong? Maybe they are not super flexible, but GPTs already use RAG and compression when you upload information to the custom knowledge base: they pull info from a vector store, a web crawler, an interface for integrating API tools, and so on...

disruptiveS_

Hi, will the Groq API support function calling, especially parallel function calls?

BIGBICAI

What's the energy cost in joules per token? If it's using 14 nm instead of 3 nm chips, it's hard to see how it's feasible, since the energy use at 3 nm is already really high.

heelspurs

How does RAG compare to traditional knowledge bases? The way I understand it, KBs are static while RAG is dynamic? If so, how does this actually work under the hood (APIs?)? Very interesting, but too many questions left unanswered.

awakenwithoutcoffee

I want to know if you'll host fine-tuned models in the future.

JameelaZain-qiwk

The problem is: I need RAG and function calling at the same time, and I would love to use Groq because of its speed. Is that possible?

gerdberlin

How do we get access to the API? I can't afford the hardware, but I'd like to pay to use the API and I can't figure out where to sign up.

dad

Is RAG basically an alternative to an infinite context window?

yorgohoebeke

You lost me when you said BBC is a trusted source

usernamesSux